Basic Biology for Bioinformatics: genes as information The central dogma of molecular genetics DNA to RNA to protein to phenotype Protein functions, synthesis.


Basic Biology for Bioinformatics: genes as information
The central dogma of molecular genetics
DNA to RNA to protein to phenotype
Protein functions, synthesis and structure
RNA synthesis and processing
DNA replication
Basics of transmission genetics
Note: many of the figures used in this presentation are copyrighted. Most are taken from "Genetics: From Genes to Genomes" by Hartwell and colleagues (McGraw Hill).

Biology for bioinformatics:
Alignment of pairs of sequences
Multiple sequence alignment
Prediction of RNA secondary structure
Phylogenetic prediction
Database searching for sequences
Gene prediction
Analysis of microarray expression data
Protein classification
Protein folding / structure prediction
Genome analysis / databases
Genetic variation (haplotypes and allelic association)

What is it about DNA that allows it to carry information?

DNA polymerase Alberts et al. Fig. 6-36

Molecular genetics: genes as information
DNA -> RNA -> protein.
DNA is digital information. Each nucleotide carries 2 bits of information.
Implications:
Low-error propagation.
Complete representation in digital databases.
Acquisition of genetic information is the raw fuel behind the explosion of bioinformatics.
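The "2 bits per nucleotide" point can be made concrete with a small sketch. The mapping of bases to bit patterns below is an arbitrary choice made for illustration, and the function names `pack` and `unpack` are hypothetical, not part of any library or of the original slides.

```python
# Hypothetical 2-bit code: four bases need log2(4) = 2 bits each.
# The assignment of bases to bit patterns is arbitrary (an assumption).
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def pack(seq):
    """Pack a DNA string into one integer, 2 bits per nucleotide."""
    value = 0
    for base in seq.upper():
        value = (value << 2) | BASE_TO_BITS[base]
    return value


def unpack(value, length):
    """Recover a DNA string of the given length from its packed integer."""
    bases = []
    for _ in range(length):
        bases.append(BITS_TO_BASE[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))


seq = "GATTACA"
packed = pack(seq)
print(packed.bit_length(), "bits used for", len(seq), "nucleotides")
print(unpack(packed, len(seq)))
```

The length has to be stored alongside the packed value, because leading A bases (encoded as 00 here) would otherwise be lost.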

Clelland et al. Nature 399:533. Hiding messages in DNA microdots.

"For it is not cell nuclei, not even individual chromosomes, but certain parts of certain chromosomes from certain cells that must be isolated and collected in enormous quantities for analysis; that would be the precondition for placing the chemist in such a position as would allow him to analyze [the hereditary material] more minutely than the morphologists." - Theodor Boveri 1904
If the information in DNA is contained in single molecules, how can we know about it?
We reduce the complexity of the DNA by amplification and use the power of complementarity to detect specific sequences by hybridization.
Determination of the chromosomal location of TGx in the human genome by fluorescent in situ hybridization (from Daniel Aeschlimann's web site, Univ. of Wales).
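Detection by complementarity can be sketched in a few lines. This is a toy model that assumes perfect base pairing only; real hybridization tolerates mismatches and depends on temperature and salt. The sequences and the function names `reverse_complement` and `hybridization_sites` are made up for illustration.

```python
# Translation table for Watson-Crick pairing: A-T and C-G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the strand that pairs with seq, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def hybridization_sites(target, probe):
    """Return 0-based positions on target where probe pairs perfectly."""
    site = reverse_complement(probe)
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


# Made-up sequences, not taken from the slides.
target = "TTGACCGATGCAATGACC"
probe = "GGTCA"
print(reverse_complement(probe))
print(hybridization_sites(target, probe))
```

A probe binds wherever the target contains the probe's reverse complement, which is why the search is done on `reverse_complement(probe)` rather than on the probe itself.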

Microarrays: Array, Scan, Visualize, Analyze (from Konstantin V. Krutovskii and David B. Neale 2001, "Forest Genomics for Conserving Adaptive Genetic Diversity")

Photolithographic arrays (Affymetrix). Each spot has an oligo with a distinct sequence.

Homologous proteins conserve elements of genetic information (sequence).

New gene functions can arise from pre-existing gene functions

Related genes retain sequence similarity.

DNA to RNA to protein to phenotype
Proteins: enzymes
Alkaptonuria
Phenylketonuria: phenylalanine buildup in the brain can cause intellectual disability

DNA to RNA to protein to phenotype Proteins: regulators

DNA to RNA to protein to phenotype Structural proteins Ehlers-Danlos syndrome (joint hypermobility) is one of the phenotypes associated with mutations in genes encoding collagen.

DNA to RNA to protein to phenotype Proteins What do they do? see

DNA to RNA to protein to phenotype

DNA to RNA to protein

DNA to RNA to protein to phenotype

Protein: hydrogen bonds within the protein and the rigidity of the peptide bond are critical determinants of protein structure.

DNA to RNA to protein to phenotype: α-helix (Molecular Biology of the Cell, Figure 3-30)

DNA to RNA to protein to phenotype: β-sheet (Molecular Biology of the Cell, Figure 3-29)

NCBI provides information about proteins

GenBank flat file format for HA oxidase

GenBank fasta file format for HA oxidase
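For readers who have not seen FASTA before: a record is a header line starting with ">" followed by one or more lines of sequence. The parser below is a minimal sketch under that assumption. The function name `parse_fasta` is hypothetical, and the example records are made up; they are not the actual HA oxidase entry.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of header -> sequence."""
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Made-up records for illustration only.
example = """>example_1 made-up protein record
MKTAYIAKQR
QISFVKSHFS
>example_2 made-up nucleotide record
ACGTACGT
"""

for name, sequence in parse_fasta(example).items():
    print(name, len(sequence))
```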

Links to other information about HA oxidase

The HA oxidase gene and its flanking region on chromosome 3q21

OMIM: Alkaptonuria is caused by mutations in HA oxidase

Conserved Domains

Three-dimensional structure of the protein, if known, can be viewed.

Lectures 8 and 35 will cover types of mutation in detail

Gene density in selected genomes

Species                    Genome size (Mb)   Gene #      Ave. size
Escherichia coli           4.7                4,…         … kb
Saccharomyces cerevisiae   12.1               6,…         … kb
C. elegans                 97                 16,…        … kb
Arabidopsis                115                25,…        … kb
Drosophila melanogaster    120                13,…        … kb
Homo sapiens               3,200              75,000 ?    40.0 kb
                                              30,…        … kb

CDS (coding sequence) sizes do not vary much at all, between 1.3 and 1.5 kb.

What's in the genome besides genes: introns

What's in the genome besides genes: remote regulatory DNA

DNA to RNA to protein to phenotype Lecture 14 will cover transcription in detail

DNA to RNA to protein to phenotype

DNA to RNA to protein

DNA to RNA

DNA to RNA to protein

DNA must be maintained. Natural processes can degrade the information in the DNA

Molecular Biology of the Cell, third edition, panel 1-1 Cells and organelles

4C: 46 chromosomes, each with 2 duplexes. 4C: 92 chromosomes. 2C: 46 chromosomes per cell.

Mitosis: heterozygosity is maintained

Meiosis results in new combinations of alleles

Mendel's laws of segregation and independent assortment come from meiosis

[Figure: chromosomes labeled with alleles A/a and B/b]

Recombination [Figure: chromosomes labeled with alleles A/a and B/b]

Measuring rates of recombination.

Formal definition of linkage disequilibrium
If two loci have alleles $A_1$, $A_2$ with frequencies $p_1$, $p_2$ and $B_1$, $B_2$ with frequencies $q_1$, $q_2$, there are four possible haplotypes ($A_1B_1$, $A_1B_2$, $A_2B_1$, and $A_2B_2$). Let their frequencies be $f_{1,1}$, $f_{1,2}$, $f_{2,1}$, $f_{2,2}$. If there is no linkage disequilibrium, then $f_{1,1} = p_1 q_1$, $f_{1,2} = p_1 q_2$, and so on. There are a number of measures of linkage disequilibrium. One of them is $D = f_{1,1} f_{2,2} - f_{1,2} f_{2,1}$.
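The definition can be checked numerically. The sketch below computes D as the product of the two "coupling" haplotype frequencies minus the product of the two "repulsion" haplotype frequencies, which is the standard form of this measure. The function name `linkage_disequilibrium` and the haplotype frequencies are made up for illustration.

```python
def linkage_disequilibrium(f11, f12, f21, f22):
    """Return (p1, q1, D) from the four haplotype frequencies.

    f11 is the frequency of haplotype A1B1, f12 of A1B2, and so on.
    """
    total = f11 + f12 + f21 + f22
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p1 = f11 + f12  # frequency of allele A1
    q1 = f11 + f21  # frequency of allele B1
    d = f11 * f22 - f12 * f21
    return p1, q1, d


# Made-up frequencies at equilibrium: each f equals the product of
# allele frequencies (p1 = 0.6, q1 = 0.3), so D should be about 0.
print(linkage_disequilibrium(0.18, 0.42, 0.12, 0.28))

# Made-up frequencies with an excess of A1B1 and A2B2 haplotypes.
print(linkage_disequilibrium(0.4, 0.1, 0.1, 0.4))
```

When the four frequencies sum to 1, D is also equal to f11 minus p1 times q1, so it measures how far the A1B1 haplotype departs from the frequency expected under independence.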

Interpreting allelic association
The general case is described by an isolated population that has high frequencies (p and r, respectively) of both a disease-causing allele $D_1$ and an unlinked marker $M_1$. The descendants of people who move from that population to a second population with different frequencies will show association between $D_1$ and $M_1$ even though the two loci are not linked.
Isolated population: p = 0.02, r = 0.5. Second population: p = 0.0001, r = 0.1.
Example: the disease-causing allele is at a high frequency in a small village. Affected people in a nearby city are more likely to have other alleles, such as $M_1$, that are found at elevated frequencies in that village, merely because they have ancestors from that village.
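The effect of mixing can be worked through with the frequencies on this slide. The sketch below assumes the two loci are independent within each source population, and measures association in the mixed population as the haplotype frequency minus the product of the allele frequencies. The mixing fraction `m` (10% of ancestry from the village) is an assumption made for illustration, as is the function name.

```python
def mixed_population_association(p_a, r_a, p_b, r_b, m):
    """Association between two unlinked alleles after mixing.

    p_a, r_a: disease and marker allele frequencies in population A.
    p_b, r_b: the same frequencies in population B.
    m: fraction of the mixed population drawn from A (an assumption).
    Within each source population the loci are taken as independent.
    """
    haplotype = m * p_a * r_a + (1 - m) * p_b * r_b  # both alleles together
    p_mix = m * p_a + (1 - m) * p_b
    r_mix = m * r_a + (1 - m) * r_b
    return haplotype - p_mix * r_mix


# Village: p = 0.02, r = 0.5. City: p = 0.0001, r = 0.1 (from the slide).
d_mix = mixed_population_association(0.02, 0.5, 0.0001, 0.1, m=0.1)
print(d_mix)
```

Algebraically the result equals m(1 - m)(p_a - p_b)(r_a - r_b), so the association is nonzero whenever both allele frequencies differ between the two populations, even with no linkage at all.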

Biology for bioinformatics:
Alignment of pairs of sequences
Multiple sequence alignment
Prediction of RNA secondary structure
Phylogenetic prediction
Database searching for sequences
Gene prediction
Analysis of microarray expression data
Protein classification
Protein folding / structure prediction
Genome analysis / databases
Genetic variation (haplotypes and allelic association)

Next time: more about the status of those problems and current state-of-the-art methods.
Tutorial II: Monday, May 10, 2118 CSIC, 2:00 - 3:45