Class 1: Introduction
The Tree of Life Source: Alberts et al
The Cell
Example: Tissues in Stomach
DNA Components
Four nucleotide types:
- Adenine (A)
- Guanine (G)
- Cytosine (C)
- Thymine (T)
Hydrogen bonds (base pairs):
- A-T
- C-G
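The pairing rules above are enough to compute one strand of a double helix from the other. The following is a minimal sketch, not part of the original slides: the function names are illustrative, and it relies on the standard fact that the two strands run antiparallel, so the partner strand is read in reverse.

```python
# Watson-Crick pairing from the slide: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIR[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand, read in its own 5'-to-3' direction."""
    return complement(strand)[::-1]


if __name__ == "__main__":
    print(complement("ATGC"))
    print(reverse_complement("ATGC"))
```

Applying `reverse_complement` twice returns the original strand, which is a quick sanity check on the pairing table.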
The Double Helix Source: Alberts et al
DNA Duplication Source: Mathews & van Holde
DNA Organization Source: Alberts et al
Genome Sizes
- E. coli (bacteria): 4.6 x 10^6 bases
- Yeast (simple fungi): 15 x 10^6 bases
- Smallest human chromosome: 50 x 10^6 bases
- Entire human genome: 3 x 10^9 bases
Genes
The DNA strings include:
- Coding regions (“genes”)
  - E. coli has ~4,000 genes
  - Yeast has ~6,000 genes
  - C. elegans has ~13,000 genes
  - Humans have ~32,000 genes
- Control regions
  - These are typically adjacent to the genes
  - They determine when a gene should be expressed
- “Junk” DNA (unknown function)
Transcription
- Coding sequences can be transcribed to RNA
- RNA nucleotides:
  - Similar to DNA, with a slightly different backbone
  - Uracil (U) instead of Thymine (T)
Source: Mathews & van Holde
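At the sequence level, transcription can be sketched as a simple substitution. This example is an illustration rather than part of the slides; it assumes the input is the coding (sense) strand, so the mRNA has the same sequence with U in place of T. The function name `transcribe` is hypothetical.

```python
def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand: same sequence, U instead of T."""
    dna = coding_strand.upper()
    unknown = set(dna) - set("ACGT")
    if unknown:
        # Reject anything that is not one of the four DNA nucleotides.
        raise ValueError(f"not a DNA sequence, unexpected symbols: {sorted(unknown)}")
    return dna.replace("T", "U")


if __name__ == "__main__":
    print(transcribe("ATGGCCTAA"))
```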
RNA Editing Source: Mathews & van Holde
RNA Roles
- Messenger RNA (mRNA)
  - Encodes protein sequences
- Transfer RNA (tRNA)
  - Adaptor between mRNA molecules and amino acids (protein building blocks)
- Ribosomal RNA (rRNA)
  - Part of the ribosome, a machine for translating mRNA to proteins
- ...
Transfer RNA
Anticodon:
- Matches a codon (triplet of mRNA nucleotides)
Attachment site:
- Matches a specific amino acid
Translation
- Translation is mediated by the ribosome
- The ribosome is a complex of protein and rRNA molecules
- The ribosome attaches to the mRNA at a translation initiation site
- The ribosome then moves along the mRNA sequence, constructing a polypeptide as it goes
- When the ribosome encounters a stop signal, it releases the mRNA. The completed polypeptide is released and folds into a protein.
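The steps above can be mimicked on a string of mRNA letters. The sketch below is an illustration, not the course's own code. It makes two simplifying assumptions: the initiation site is taken to be the first AUG in the sequence, and the stop signal is any stop codon of the standard genetic code. The names `CODON_TABLE` and `translate` are hypothetical, and amino acids are written in one-letter codes.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")  # simplification: first AUG is the initiation site
    if start == -1:
        return ""
    peptide = []
    # Read the message one codon (three nucleotides) at a time.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop signal: release the polypeptide
            break
        peptide.append(amino_acid)
    return "".join(peptide)


if __name__ == "__main__":
    print(translate("GGAUGUUUUGGUAGCC"))
```

Real initiation depends on sequence context around the start codon, so picking the first AUG is only a convenient stand-in for the initiation site mentioned above.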
Translation Source: Alberts et al
Genetic Code
Protein Structure
- Proteins are polypeptides: chains of amino acids
- A protein's structure is (mostly) determined by the sequence of amino acids that make it up
Protein Structure
Evolution
- Related organisms have similar DNA
  - Similarity in sequences of proteins
  - Similarity in organization of genes along the chromosomes
- Evolution plays a major role in biology
  - Many mechanisms are shared across a wide range of organisms
  - During the course of evolution, existing components are adapted for new functions
Evolution
Evolution of new organisms is driven by:
- Diversity
  - Different individuals carry different variants of the same basic blueprint
- Mutations
  - The DNA sequence can change through single-base changes, deletion or insertion of DNA segments, etc.
- Selection bias
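The mutation types listed above map directly onto string edits. The following sketch is illustrative only: the function names and the 0-based position convention are assumptions made for this example, not something defined in the slides.

```python
def substitute(seq: str, pos: int, base: str) -> str:
    """Single-base change: replace the base at position pos (0-based)."""
    return seq[:pos] + base + seq[pos + 1:]


def insert(seq: str, pos: int, segment: str) -> str:
    """Insertion: place a DNA segment before position pos."""
    return seq[:pos] + segment + seq[pos:]


def delete(seq: str, pos: int, length: int) -> str:
    """Deletion: remove length bases starting at position pos."""
    return seq[:pos] + seq[pos + length:]


if __name__ == "__main__":
    original = "GATTACA"
    print(substitute(original, 2, "C"))
    print(insert(original, 3, "GG"))
    print(delete(original, 1, 3))
```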
Course Goals
- Computational tools in molecular biology
- We will cover computational tasks that are posed by modern molecular biology
- We will discuss the biological motivation and setup for these tasks
- We will understand what kinds of solutions exist and what principles justify them
Four Aspects
Biological
- What is the task?
Algorithmic
- How to perform the task at hand efficiently?
Learning
- How to adapt the parameters of the task from examples?
Statistics
- How to differentiate true phenomena from artifacts?
Example: Sequence Comparison
Biological
- Evolution preserves sequences, so similar genes might have similar functions
Algorithmic
- Consider all ways to “align” one sequence against another
Learning
- How do we define “similar” sequences? Use examples to define similarity
Statistics
- When we compare against ~10^6 sequences, what is a random match and what is a true one?
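For the algorithmic aspect, one standard way to consider all alignments without enumerating them is dynamic programming. The sketch below computes a global alignment score in the style of Needleman-Wunsch. It is an illustration rather than the method the course commits to, and the scoring values (match +1, mismatch -1, gap -1) are arbitrary assumptions; choosing them well is exactly the learning question raised above.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Best score over all global alignments of a and b (linear gap penalty)."""
    # prev[j] holds the best score for aligning the processed prefix of a
    # against the first j characters of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against the empty prefix of b
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + pair,  # align a[i-1] with b[j-1]
                prev[j] + gap,       # a[i-1] aligned to a gap
                curr[j - 1] + gap,   # b[j-1] aligned to a gap
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("ACGT", "AGT"))
```

The table has one cell per pair of prefixes, so the running time grows with the product of the two sequence lengths, and only two rows are kept in memory at a time.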
Topics I
Dealing with DNA/Protein sequences:
- Genome projects and how sequences are found
- Finding similar sequences
- Models of sequences: Hidden Markov Models
- Transcription regulation
- Protein families
- Gene finding
Topics II
Gene Expression:
- Genome-wide expression patterns
- Data organization: clustering
- Reconstructing transcription regulation
- Recognizing and classifying cancers
Topics III
Models of genetic change:
- Long term: evolutionary changes among species
- Reconstructing evolutionary trees from current-day sequences
- Short term: genetic variations in a population
- Finding genes by linkage and association
Topics IV
Protein World:
- How proteins fold: secondary and tertiary structure
- How to predict protein folds from sequence data alone
- How to analyze protein changes from raw experimental measurements (MassSpec)
- 2D gels
Class Structure
- 2 weekly meetings
  - Class: Mondays
  - Targil (exercise session): Tuesdays
Grade:
- 60% in five question sets
  - Each contains theoretical problems and practical computer questions
- 40% test
- 5% bonus for active participation
Exercises & Handouts
- Check regularly