Bioinformatics: Why Can’t It Tell Us Everything?
Bioinformatics: What Are Our Data Sets? We are interested in information flow within cells. Currently, the key information is mostly a matter of biological macromolecules. Eventually, information of interest will also include the flow of nutrients and energy, and the impact of small molecules on macromolecular function.
Bioinformatics: What Are Our Questions? What is in there? What does it do? How similar is it to something else? How does it fold? Where does it go in a cell? What does it interact with? How is it regulated? What is the level of confidence?
Bioinformatics: Logical Reasoning Behind Data Sets
* Function of an organism is determined by the function of its cells
* Function of cells is determined by the chemical reactions that take place within them
* Chemical reactions occur or not according to the presence and activity of enzymes
* Enzymes are proteins
* Proteins are determined by genes
* Therefore, genes determine organismal function
Genomics Proteomics
Central Dogma: Flow of Information
Central Dogma: DNA as the Blueprint for Life?
Central Dogma: DNA → RNA → Protein. Genes and proteins are different molecular languages, but they are colinear.
DNA: Basic Unit (alphabet): Nucleotide (base). Only 4: A, T, G, and C. Double-stranded: A<>T and G<>C.
5’..AGCTGCATGCTAGCTGACGTCA….3’
3’..TCGACGTACGATCGACTGCAGT….5’
“Words” (genes) encode proteins and RNA. Double helical.
DNA: Structure Connected to Information (image: DNA Tower in Perth, AUS)
DNA: Replication & Transcription as Algorithms. With rare exceptions, all DNA is replicated. The crucial tool is the ability to go from one strand to the other. Transcription uses the same base-pairing rules, with U instead of T, but occurs in packets.
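The "algorithm" view of base pairing can be sketched in a few lines of Python. This is a minimal illustration, not a model of the real enzymes: the function names `complement`, `reverse_complement`, and `transcribe` are made up for this example, and the input is assumed to contain only the uppercase letters A, T, G, and C.

```python
# Base-pairing rules: A<>T and G<>C for DNA; in RNA, U takes the place of T.
DNA_PAIRS = str.maketrans("ATGC", "TACG")
RNA_PAIRS = str.maketrans("ATGC", "UACG")


def complement(strand: str) -> str:
    """Return the partner strand, position by position (opposite polarity)."""
    return strand.translate(DNA_PAIRS)


def reverse_complement(strand: str) -> str:
    """Return the partner strand written in the usual 5'->3' direction."""
    return complement(strand)[::-1]


def transcribe(template: str) -> str:
    """Pair RNA bases against a DNA template strand, position by position."""
    return template.translate(RNA_PAIRS)


top = "AGCTGCATGCTAGCTGACGTCA"       # 5'->3' strand from the slide
bottom = complement(top)             # 3'->5' strand from the slide
print(bottom)                        # TCGACGTACGATCGACTGCAGT
print(transcribe(bottom))            # AGCUGCAUGCUAGCUGACGUCA
```

Transcribing the bottom strand gives an RNA with the same sequence as the top strand, except that every T is replaced by U.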
Transcription = DNA to RNA: Where to Start Is a Big Question
Protein Alphabet: amino acids. There are 20 amino acids. Met Cys Ser Leu Ala Val
Proteins: Number of Possible 100-mer Peptides? 20 possible residues at each position. For 2-mers, 20 possible at position 1 and 20 possible at position 2, so $20 \times 20 = 20^{2} = 400$. Same logic for 100-mers: $20^{100} = 2^{100} \times 10^{100} = (2^{10})^{10} \times 10^{100} \approx (10^{3})^{10} \times 10^{100} = 10^{130}$
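The counting argument can be checked directly, since Python integers have arbitrary precision. This is a small sketch; the helper name `count_peptides` is hypothetical and assumes every position is chosen independently from the 20 standard amino acids.

```python
import math


def count_peptides(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct sequences of the given length."""
    return alphabet_size ** length


two_mers = count_peptides(2)
hundred_mers = count_peptides(100)

print(two_mers)                    # 400
print(len(str(hundred_mers)))      # 131 digits, i.e. a bit more than 10^130
print(100 * math.log10(20))        # exponent of 10, about 130.1
```

The exact value is about 1.27 x 10^130. The round figure of 10^130 comes from approximating 2^10 = 1024 by 10^3.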
Proteins: Folding Starts Local. alpha-helix, beta-pleated sheet
Proteins: Folding Goes Global
Proteins: Predictive Protein Folding as Holy Grail
Protein Alphabet: amino acids. There are 20 amino acids, encoded by codons (triplets of nucleotides).
ATGTGCAGCCTAGCTGCCGTC
Met Cys Ser Leu Ala Val
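Reading the slide's DNA sequence three letters at a time can be sketched as follows. The lookup table below is deliberately partial: it holds only the seven codons that occur in this sequence, with their standard genetic code assignments, and the names `CODONS`, `split_codons`, and `translate` are illustrative. Note that the sequence contains two alanine codons (GCT and GCC), so the full reading has seven residues.

```python
# Partial codon table (DNA letters), limited to the codons in the slide's example.
CODONS = {
    "ATG": "Met", "TGC": "Cys", "AGC": "Ser", "CTA": "Leu",
    "GCT": "Ala", "GCC": "Ala", "GTC": "Val",
}


def split_codons(dna: str) -> list[str]:
    """Cut a sequence into consecutive triplets, dropping any incomplete tail."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def translate(dna: str) -> list[str]:
    """Look up each triplet; raises KeyError for codons outside the partial table."""
    return [CODONS[codon] for codon in split_codons(dna)]


sequence = "ATGTGCAGCCTAGCTGCCGTC"
print(split_codons(sequence))
# ['ATG', 'TGC', 'AGC', 'CTA', 'GCT', 'GCC', 'GTC']
print("-".join(translate(sequence)))
# Met-Cys-Ser-Leu-Ala-Ala-Val
```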
Genetic Code Found on Earth: How Does It Work? 5’-UCGACCAUGGUUGACCAUUGAUUACCACG-3’
Genetic Code: Triplet, Nonoverlapping, Comma-less, Redundant
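These four properties can be made concrete with a short sketch using the standard genetic code in one-letter amino acid notation (`*` marks a stop codon). The names `CODON_TABLE` and `translate_from_start` are hypothetical, and the reading rule is a simplification: start at the first AUG, step three bases at a time, stop at the first stop codon. Real start-site selection is more involved.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed in UCAG order for each of the three positions.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from_start(mrna: str) -> str:
    """Translate from the first AUG in nonoverlapping triplets until a stop."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    # Triplet + nonoverlapping + comma-less: step by 3 with no skipped bases.
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


# Redundant: 64 codons map onto 20 amino acids plus stop.
degeneracy = Counter(CODON_TABLE.values())
print(len(CODON_TABLE), degeneracy["L"], degeneracy["M"], degeneracy["*"])
# 64 6 1 3

print(translate_from_start("UCGACCAUGGUUGACCAUUGAUUACCACG"))
# MVDH
```

Applied to the mRNA from the previous slide, the reading frame is AUG GUU GAC CAU UGA, which gives Met-Val-Asp-His followed by a stop.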
Bioinformatics: Mining a Mountain of Data Where are the putative genes?