Published by Alexandra Howard. Modified over 9 years ago.
1
Topics in Bioinformatics CS832b Bin Ma
2
Lecture 1: Basics
3
Three molecules we will study
DNA
    A string over alphabet {A,C,G,T}
RNA
    Primary structure – a string over alphabet {A,C,G,U}
    Secondary and tertiary structures
Protein
    Primary structure – a string over alphabet {A,R,N,D,C,Q,E,G,H,I,L,K,M,F,P,S,T,W,Y,V}
    Secondary and tertiary structures
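The three alphabets can be written down directly as sets. The sketch below is only an illustration: the function name `classify` is hypothetical, and it checks nothing beyond set membership. It also shows that a short DNA string is a valid string over the protein alphabet, so the alphabet alone does not always identify the molecule.

```python
# Alphabets as listed on the slide.
DNA_ALPHABET = set("ACGT")
RNA_ALPHABET = set("ACGU")
PROTEIN_ALPHABET = set("ARNDCQEGHILKMFPSTWYV")


def classify(seq):
    """Return the names of every alphabet that `seq` is a string over."""
    letters = set(seq.upper())
    candidates = (
        ("DNA", DNA_ALPHABET),
        ("RNA", RNA_ALPHABET),
        ("protein", PROTEIN_ALPHABET),
    )
    return [name for name, alphabet in candidates if letters <= alphabet]


if __name__ == "__main__":
    print(classify("AGTAGCCTATGCGA"))  # also a valid protein string
    print(classify("ACGU"))
    print(classify("MCSYIRYDTPK"))
```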
6
5’ 3’
8
DNA
5’…AGTAGCCTATGCGA…3’
  …::::::::::::::…
3’…TCATCGGATACGCT…5’

5’…AGTAGCCTATGCGA…3’
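Because the two strands pair A with T and C with G, one strand determines the other. A minimal sketch of that relationship, using the strand from the slide; the function names are illustrative, not from the lecture.

```python
# Watson-Crick pairing: A<->T, C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand):
    """Base-by-base complement, read in the same left-to-right order."""
    return strand.translate(COMPLEMENT)


def reverse_complement(strand):
    """The opposite strand written in its own 5'->3' direction."""
    return complement(strand)[::-1]


if __name__ == "__main__":
    top = "AGTAGCCTATGCGA"
    print(complement(top))          # bottom strand as drawn, 3'->5'
    print(reverse_complement(top))  # bottom strand written 5'->3'
```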
9
>CHRX GATCACCTGACATCAGGAGTTCAAGACCAGCCTGCCAACGTGGTGAAACC CCATCTCTACTAAAAATAGGAAATTCACCTGGTGGCAGGTGCCTGTAATC CCAGCTACTCGGGAGGCTGAGGCAGAAGAATCGCTTGAACCCAGGAGGTG GAGATTGCACTGAGCTGAGATCACGCCACTGCGCTCCAGCCTGGGTGACA GAGCAAGACTCCATAAAAAAAAAAATTATAACCTAATGATTAAATACTGT AGGGAAGAGCTTACCACAATTGCTGGCCCATGGCCAATGCTGGGTATAAG ACAGCTACTGCAAACAACCATGATGATGATACATCTCTTGTGTAGGGTTA GGTTGTTTGAGACACATTCTATGCTCCTTGATTTGATTGGAAGGTACCTT GGTTCCTTGGGGACTTGGAGGTGACGAAAGCCTCCCTGGGGACAAAACTC ACCTTCACTTCTCTAATATCAAGCTTCAGCAACCTGCTCCAGCTACAGCA CAGGGTTGGACAGGCCCAACAACAGAGGAAATCCACAAAGTGTGTCTTGA CACATACATCCACGGGGTCTAACGAGGTGAGGCCAATGACTGCTTCCACA CACCCCAGCCAGACTCTGACTTCACTCCCGGCAGGTTTCAGTAGACTTGG CAGCAGTTGGAGCGAGCTGGCTTCTTGCGGTAGGCAGCCATGTTGGAAGA GCTCCCAATAGTCCTCGTTTCCTGGTAATCTCATGCTTGGATCATCTTCT TCTCTTGAGTGAAGAGAAGAACTGCAGAGAGAGACAGAGACAGAGAGACA GATCACAGGGGCAGTTTCCCCCATACTGTTCTCAAGATAAATGAGTCAAC TCTTACACCTCTTTTCTCTGGTGTAAAACAAGGCTGGTGAACAGGCAGAG AGAACTGGGGTGTTGGAGTAGCATTGACCTTCCTTCTTCATCCCTCTATA ATCTCTCCTAGTGCAGGAGTAGGAAAACTAAAAATCACACGTCTGATCAT CTGTGATCTCAGAGTCTTGGACAAGCCTTGCTTGCCAATCAGCAGGGATG GGAGTTGGAGCCATCTCCAAGTGTCCCCCCACAAATCTATGTCCACCTGG AAGTTTCAAATGCAACTTTATTTGGGAAAGGCAATTTTGCAAATGTTATT AAGTGAAGGATCTAGGGATGAGATCATCCTGGAGTAGGGTGGGTCCTAGG TCAAATGACAGGAAATCTGCCCACCTCGGCCTCCCAAAGTGCTGGGATTA CAGGCATGAGCCACCAAACCTGGCCTATCATTGATTTAATGATTAATACG GTTAGGCTCTGTGTCCCCACCCAAATCTCATCTCAAATTGTAATTCCCAT GTGTCCAGGGAGGGAGCTTGTGGAAGGTGATTGGATCACAGGGGCAGTTT TTGTCATGCTGTTCTCATGATAAATGAGTCAATTCTCAGAAGAGATGATG GTTTTAAAGTGTGGCACTTCTTTGCTCTCTTGCTCTCTCTCTCTCCTGAG TAGACTGGCTCATTCTTTCTACTGGTTACAAGCAATAGAAGTGATAACAA AATTGATGGTTTCTCATTTCCTAAATGGTACCAGTGGATTCCTGGTTTCC TCTCTCTCTCTTCTCTCTCTCTATCAACTTTTCCCTCAATCTCTCTATCA ACCTCCCTCTCTCTCAATCTCAATCTCTCTCAGTCTCATTCTCAATCTCT TTTGCTCAATCTCTTTCTCAGCTTCTCTCCCTCAATTTCTCTTTTGCAAC TTCTCTCTCTCAGTCTGTGTCTCTCAATCTCCCTCTCTCAATCTCTCTTG TAGTCTCCCTGTCTCTCATACTCTCTCTGTTTCTGTCTGTCTCTGCCCTT GCTCTAGGGAAAGCAAGTTCTTATGCTGTAAGTTCTCCTGTAAAAAGGTC CACATGATACGGAACTGGCCATCTTTGGCCAACATGAGTGAGTTTAGAAG 
TGTGCCTTTCACCAGTTGAGCCTTCAAATGAGATCCCAGCCCTGGATGAC ACAGTGACAGTAACCTGCTAGGAACTGTGAACCAGAGGCACCCAGCCAAG CTGCTCCCAGACTCCCAACCCAGTGAAACCATAAGATAATAAATGCATGT TGTTTTAAGCTGCTAAGTTTGGGGGTCACTTGTTACACAGCAACAGCTGA CTCATACATTTTCTTTGAAATTGATTTCCACTTCTGTCACCAGCATCATT CCATAAATTTGCTCTATGTGCATTGCTGACCTGCAGTAGAAGTTTTGGAG AAGTGAACCACATCCCCTTATCTGCCATTTGACAGCAAGCAGCCTCAAAC ATTCATAATTTCTTTCCTGACTCTCCACTCCACACTGTTGCCTGCCTTCC TGGTTCCAGATCTTTGGATCTGGACTGACACCTGGGCACTGTCATAGGCA TCCGTGTGAAGAGACCACCAACAGGCTCTGTGTGAGCAATAAAGCTTTTT AATCACCTGGGTGCAGGTGGGCTGATTCTGAAAAGAGAGTCAGCAAAGAG TGGTGGGATTATCATTAGTTCTTATAGGTTCGGGATAGGTGGTGGAGTTA GGAGCAATTTTTTGTGGGCAGGGAGTGGATCTTACAAAGGACATTCTCAA GGGTGGGGATGATTTTACAAAGTACCTTCTTAAGGGCGGGGGAGGATATT ACAAAGTACCTTCTCAAGGGTGGGGATGATTTTACAAAGTACCTTCTTAA GGGCGGGGGAGGATATTACAAAGTACCTTCTCAAGGGTGGGGGTGGATAT TACAAAGTACCTTCTTAAGGGCAGGGGAGGATATTACAAAGTACCTTCTC AAGGGGGGGGATGATTTTACAAAGTACCTTCTTAAGGGCGGGGGAGGATA TTACAAAGTACCTTCTCAAGGGTGGGGGTGGATATTAGAAAGTACCTTCT
Chromosome X is one of the 23 chromosomes in the human genome. Chromosome X has 162 million base pairs.
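The ">CHRX" line followed by sequence text is the FASTA layout. A rough sketch of reading such text into a dictionary is shown here. The helper name `parse_fasta` is hypothetical, the sample holds only the first 100 bases of the slide's excerpt, and real files are usually read with an established library.

```python
from collections import Counter


def parse_fasta(text):
    """Map each '>' header (without the '>') to its concatenated sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            # Sequence lines may be broken anywhere; join them back together.
            records[name].append(line.replace(" ", ""))
    return {n: "".join(parts) for n, parts in records.items()}


# First two 50-base chunks of the slide's excerpt, used as sample input.
SAMPLE = """>CHRX
GATCACCTGACATCAGGAGTTCAAGACCAGCCTGCCAACGTGGTGAAACC
CCATCTCTACTAAAAATAGGAAATTCACCTGGTGGCAGGTGCCTGTAATC
"""

if __name__ == "__main__":
    records = parse_fasta(SAMPLE)
    seq = records["CHRX"]
    print(len(seq), "bases")
    print(Counter(seq))  # per-base counts
```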
10
Genome Sizes
Species                                Size in bps
Amoeba dubia                           670,000,000,000
Homo sapiens                           3,400,000,000
Drosophila melanogaster                180,000,000
Mycoplasma genitalium                  580,000
Human immunodeficiency virus type 1    9,750
11
Protein and Amino Acids
12
Protein
13
GOT Ecoli
14
A protein sequence
>gi|7228451|dbj|BAA92411.1| EST AU055734(S20025) corresponds to a region …
MCSYIRYDTPKLFTHVTKTPPKNQVSNSINDVGSRRATDRSVASCSSEKSVGTMSVKNASSISFEDIEKSISNWKIPKVN
IKEIYHVDTDIHKVLTLNLQTSGYELELGSENISVTYRVYYKAMTTLAPCAKHYTPKGLTTLLQTNPNNRCTTPKTLKWD
EITLPEKWVLSQAVEPKSMDQSEVESLIETPDGDVEITFASKQKAFLQSRPSVSLDSRPRTKPQNVVYATYEDNSDEPSI
SDFDINVIELDVGFVIAIEEDEFEIDKDLLKKELRLQKNRPKMKRYFERVDEPFRLKIRELWHKEMREQRKNIFFFDWYE
SSQVRHFEEFFKGKNMMKKEQKSEAEDLTVIKKVSTEWETTSGNKSSSSQSVSPMFVPTIDPNIKLGKQKAFGPAISEEL
VSELALKLNNLKVNKNINEISDNEKYDMVNKIFKPSTLTSTTRNYYPRPTYADLQFEEMPQIQNMTYYNGKEIVEWNLDG
FTEYQIFTLCHQMIMYANACIANGNKEREAANMIVIGFSGQLKGWWNNYLNETQRQEILCAVKRDDQGRPLPDRDGNGNP
TELKEGFHMEEKDEPIQEDDQVVGTIQKYTKQKWYAEVMYRFIDGSYFQHITLIDSGADVNCIREDEILDQLVQTKREQV
VNSIYLHDNSFPKSMDLPDQKITEKRAKLQDIPHHEERLLDYREKKSRDGQDKLPMEVEQSMATNKNTKILLRAWLLST
A protein sequence may have from a few hundred to several thousand amino acids.
15
RNA
16
Animal cell (figure labels): Nucleus, Chromatin, Mitochondrion, Nucleolus (rRNA synthesized), Plasma membrane, Cell coat, Cytoplasm
17
Protein synthesis
19
Genetic code
..ATT CAC AGT GGA..
   I   H   S   G
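The mapping from codons to amino acids can be sketched as a lookup table. The block below assumes the standard genetic code, written in the usual compact form (bases ordered T, C, A, G; `*` marks a stop codon). The names `CODON_TABLE` and `translate` are illustrative only.

```python
from itertools import product

# Standard genetic code; the 64 codons are enumerated with the first base
# varying slowest and the third base fastest, each in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA codon by codon from position 0; '*' marks a stop codon.

    Trailing bases that do not fill a whole codon are ignored.
    """
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


if __name__ == "__main__":
    print(translate("ATTCACAGTGGA"))  # the slide's example
```

On the slide's example this reproduces the four residues I, H, S, G.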
20
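Notes on translation
    Reading frame
    Start and stop codons
    The third base of a codon is often not important (several codons that differ only in the third base code for the same amino acid)
    Translation proceeds 5’ -> 3’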
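The reading-frame note can be made concrete: the same strand splits into codons in three different ways depending on where reading starts. A small sketch, with a hypothetical helper name, covering only the forward strand:

```python
def reading_frames(dna):
    """Split `dna` into codons for each of the three forward reading frames.

    Bases left over at the end of a frame (fewer than three) are dropped.
    """
    frames = []
    for offset in range(3):
        codons = [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]
        frames.append(codons)
    return frames


if __name__ == "__main__":
    for offset, codons in enumerate(reading_frames("ATTCACAGTGGA")):
        print(offset, " ".join(codons))
```

Only the frame starting at offset 0 gives the codons ATT CAC AGT GGA used in the genetic-code example; the other two frames read the same bases as entirely different codons.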
21
DNA replication
22
The Central Dogma of Molecular Biology
DNA -> RNA -> Protein
DNA -> DNA: replication; DNA -> RNA: transcription; RNA -> Protein: translation
Genotype: DNA; phenotype: protein
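At the string level, the DNA to RNA step amounts to rewriting T as U. The sketch below assumes the input is either the coding (sense) strand or the template strand, both written 5'->3'. The function names are illustrative and the model ignores everything except the base sequence.

```python
def transcribe(coding_strand):
    """mRNA has the same sequence as the coding strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def transcribe_from_template(template_strand):
    """mRNA is the reverse complement of the template strand, using U for A."""
    pairing = str.maketrans("ACGT", "UGCA")
    return template_strand.upper().translate(pairing)[::-1]


if __name__ == "__main__":
    print(transcribe("ATTCACAGTGGA"))
    print(transcribe_from_template("TCCACTGTGAAT"))  # same mRNA as above
```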
23
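Exception – retroviruses
DNA -> RNA -> Protein, plus RNA -> DNA (reverse transcription)
DNA -> DNA: replication; DNA -> RNA: transcription; RNA -> Protein: translation
Genotype: DNA; phenotype: protein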
24
Protein Phenotype DNA (Genotype) Biology
25
Genes
One gene encodes one protein (or sometimes an RNA).
Like a program, a gene starts with a start codon (e.g. ATG); then each group of three bases codes for one amino acid, until a stop codon (e.g. TGA) signals the end of the gene.
Genes are dense in prokaryotes and sparse in eukaryotes.
In the middle of a eukaryotic gene there are introns, which are spliced out (as junk) after transcription. The parts that are kept are called exons.
Locating genes, including their exon and intron boundaries, is the task of gene finding.
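The "starts with a start codon, ends with a stop codon" description suggests a very naive gene finder: scan each reading frame for ATG and read codons until a stop. The sketch below rests on strong simplifying assumptions that are not from the slides: no introns (a prokaryote-like setting), forward strand only, and the three standard stop codons TAA, TAG and TGA. `find_orfs` and `min_codons` are hypothetical names. Real gene finding is far more involved.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end) index pairs of open reading frames in `dna`.

    An ORF here runs from the first ATG seen in a frame through the next
    in-frame stop codon; `end` is exclusive and includes the stop codon.
    `min_codons` counts the start and stop codons too.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    dna = "CCATGAAATGA"
    for start, end in find_orfs(dna):
        print(start, end, dna[start:end])
```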
26
Introns and Exons
27
Jumping genes
Some genes (transposons) can move to a new position in the genome, jumping over other genes.
28
Gene related diseases
Hemophilia: gene on the X chromosome.
Sickle-cell anemia: a single-nucleotide mutation in the first exon of the beta-globin gene (it removes a cutting site). 1 in 12 African Americans is a carrier; homozygotes have the disease.
BRCA1 gene (chr. 17q): responsible for ½ of inherited breast cancer (10% of breast cancer).
Fragile X syndrome (intellectual disability): 1 in 1250 males, 1 in 2500 females (dominant, but females have a partially expressed good gene). FMR-1 gene: more than 200 tri-nucleotide repeats cause the disease.
P53 gene: chr. 17p, implicated in ½ of all cancers.
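The FMR-1 criterion (more than 200 tri-nucleotide repeats) can be illustrated by counting the longest tandem run of a repeat unit. The sketch below assumes the unit is CGG, which the slide does not state, and uses hypothetical names. It is a string-counting illustration only, not a diagnostic method.

```python
import re

# Threshold from the slide: more than 200 repeats is associated with disease.
DISEASE_THRESHOLD = 200


def longest_tandem_run(dna, unit="CGG"):
    """Length, in repeat units, of the longest uninterrupted run of `unit`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = (len(m.group(0)) // len(unit) for m in pattern.finditer(dna.upper()))
    return max(runs, default=0)


def exceeds_threshold(dna, unit="CGG"):
    """True if the longest run has more than DISEASE_THRESHOLD repeats."""
    return longest_tandem_run(dna, unit) > DISEASE_THRESHOLD


if __name__ == "__main__":
    toy = "AA" + "CGG" * 5 + "TT" + "CGG" * 3
    print(longest_tandem_run(toy))
    print(exceeds_threshold("CGG" * 201))
```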