Published by Felicia Barnett. Modified over 9 years ago.
1 Bio + Informatics: An Overview
AAACTGCTGACCGGTAACTGAGGCCTGCCTGCAATTGCTTAACTTGGC
WWW.IBP.IR (Iranian Bioinformatics Portal)
2 Outline
Introduction
DNA
Definitions
Problems in bioinformatics
Conclusion
3 “Sciences reach a point where they become mathematized!” (Leonard Adleman)
4 Computing Devices
Computers → electronic components (transistors, …)
Brains → biological components (neurons, …)
Cells → biomolecular components (DNA, …)
5 DNA
Deoxyribonucleic acid (DNA)
Four nucleotides (bases), or building blocks: A, T, G, C
Zips itself up into a double helix using base pairs:
→ A with T
→ G with C
DNA is essentially digital
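The pairing rule on this slide (A with T, G with C) can be sketched in a few lines of Python. The names `PAIR`, `complement` and `reverse_complement` are illustrative and not from the presentation. The sketch assumes uppercase input containing only A, T, G and C.

```python
# Watson-Crick pairing as stated on the slide: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand):
    """Return the base-by-base pairing partner of a DNA strand."""
    return "".join(PAIR[base] for base in strand)

def reverse_complement(strand):
    """Return the partner strand, read in its own 5'-to-3' direction."""
    return complement(strand)[::-1]

print(reverse_complement("AAACTG"))  # prints CAGTTT
```

Because each base fixes its partner, one strand fully determines the other. This is one sense in which DNA can be treated as digital data.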
6 Bioinformatics
Biomolecular computation
→ idea: use biomolecules and biochemical processes to solve computational problems
Computational molecular biology
→ goal: understand/explain biomolecular systems and mechanisms
7 After going through an age of specialization, the sciences are now reuniting into a common mode of inquiry. “The next generation could produce a scientist in the old sense, a real generalist.” (Leonard Adleman)
8 Biomolecular Computation
Idea: use biomolecules and biochemical processes to solve computational problems
Starting point: Leonard Adleman, 1994
→ solving the Hamiltonian Path Problem using liquid-phase DNA chemistry
Advantages:
→ fast, through massive parallelism
→ energy-efficient
→ large storage capacity
9 Computational Molecular Biology (Bioinformatics)
Goal: understand/explain biomolecular systems and mechanisms
Application of computer technology to the management of biological information
Using computers to gather, store, analyze and integrate biological and genetic information
10 Problems in Bioinformatics
11 Sequencing Genomes
GAGGGAACACAGTCTGCACACTCCTTCCGATAT
GAGGGAACACA GTCTGCACACT CCTTCCGATAT
12 Sequencing Genomes
GAGGGAACACAGTCTGCACACTCCTTCCGATAT
GAGGGAACACAGT AGTCTGCACACTC CTCCTTCCGATAT
13 Sequencing Genomes
Concrete problem: sequence assembly
→ given: fragments of a large DNA sequence with overlaps (multiple coverage)
→ want: the entire sequence
Complicating factors
→ computational complexity: can be seen as a variation of the shortest common superstring problem, which is known to be NP-hard
→ incorrect/missing nucleotides in fragment data
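Since the shortest common superstring problem is NP-hard, practical approaches rely on heuristics. The sketch below shows one simple greedy heuristic: repeatedly merge the two fragments with the largest suffix/prefix overlap. It is not necessarily the method the presentation has in mind. The function names are illustrative, and it assumes error-free fragments, none of which lies strictly inside another.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(fragments):
    """Greedy superstring heuristic: merge the best-overlapping pair until one string is left."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Glue the pair together, writing the shared region only once.
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags[0] if frags else ""

# The three overlapping fragments from slide 12.
reads = ["GAGGGAACACAGT", "AGTCTGCACACTC", "CTCCTTCCGATAT"]
print(greedy_assemble(reads))  # prints GAGGGAACACAGTCTGCACACTCCTTCCGATAT
```

The greedy choice gives no optimality guarantee, and the exact-match `overlap` breaks down as soon as fragments contain incorrect or missing nucleotides, which is the second complicating factor listed on the slide.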
14 Relations between Organisms
Concrete problem: phylogenetic tree inference
→ given: homologous DNA sequences from multiple species
→ want: evolutionary tree relating these sequences
Complicating factors
→ errors in sequences
→ complexity/quality of multiple sequence alignment
→ limited knowledge of evolutionary processes
15 Sequence Alignment
16 DNA-Genes-Proteins
DNA → Genes → Proteins
DNA is the basic molecule of life: it directly controls the fundamental biology of life
Proteins determine the biological makeup of humans and all other living organisms
Variations and errors in genomic DNA may lead to diseases or disorders
17 DNA → Proteins
DNA (gene) → mRNA → Protein
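The two arrows on this slide (DNA to mRNA, mRNA to protein) can be illustrated with a minimal Python sketch. It assumes the input is the coding strand of an intron-free gene, so transcription reduces to replacing T with U, and it uses the standard genetic code. The names `transcribe`, `translate` and `CODON_TABLE` are illustrative.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def transcribe(coding_dna):
    """DNA (coding strand) to mRNA: same sequence with U in place of T."""
    return coding_dna.replace("T", "U")

def translate(mrna):
    """mRNA to protein: read codons from position 0 until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

mrna = transcribe("ATGGCCTGA")
print(mrna, translate(mrna))  # prints AUGGCCUGA MA
```

Real genes add steps that this sketch ignores, such as locating the start codon and removing introns before translation.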
18 Computational Gene Finding
Given: raw sequence data
Predict:
→ coding and non-coding regions
→ exons/introns
→ splicing patterns
→ transcription factors
Pre-mRNA: Exon1 Intron1 Exon2 Intron2 Exon3
mRNA: Exon1 Exon2 Exon3
19 Structure Prediction: RNA & Protein
Minimum free energy
RNA structure:
→ primary structure: single-stranded sequence of A, U, G, C
→ secondary structure: intramolecular base pairs among its bases
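Real minimum-free-energy prediction needs experimentally derived energy parameters, which the presentation does not give. As a simpler stand-in, the sketch below uses Nussinov-style dynamic programming, which maximizes the number of base pairs instead of minimizing free energy. It allows only A-U and G-C pairs and only pseudoknot-free structures. The `min_loop` parameter (at least three unpaired bases inside a hairpin) is an assumption, not something stated on the slides.

```python
# Allowed pairs in this sketch: Watson-Crick only (A-U and G-C).
CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def max_base_pairs(rna, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in rna."""
    n = len(rna)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence rna[i..j], inclusive.
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, leaving at least min_loop bases between them
            for k in range(i, j - min_loop):
                if (rna[k], rna[j]) in CAN_PAIR:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]

print(max_base_pairs("GGGAAACCC"))  # prints 3
```

The three nested loops give O(N^3) time. The table stores only the optimal count; recovering the pairs themselves would need an additional traceback step.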
20 Secondary Structure
5’- GAGGGAACACAGUCUGCACACUCCUUC -3’
21 Arc Diagram Representation
22 Loops
[Figure: secondary structure over positions 1 to 27, with labeled loop types]
Loop types shown: hairpin loop, interior loop, multi loop, external loop, bulge loop, stacked pair
23 Pseudoknotted Structure
[Figure: pseudoknotted structure over positions 1 to 13]
24 Structure Prediction Algorithms
Dynamic programming algorithms
→ restricted class of pseudoknotted structures
→ Rivas and Eddy (R&E): O(N^6)
Heuristic algorithms
→ search over the solution space
25 Motif Discovery
26 Genes and Diseases
Proteins perform most of life’s essential functions
Changes in the DNA sequence of a genome can have disastrous consequences
27 Real World Applications
28 Related Aspects
Computational models of organisms or biological systems
Nature-inspired algorithms
→ genetic algorithms
→ neural networks
→ ant colony optimization
Artificial life
→ life-like behavior of artificial systems
→ (re)design of biological organisms
29 Conclusion
Bioinformatics: using computers for gathering, storing and analyzing biological data
30 Thank you!
Baharak Rastegari, baharak@cs.ubc.ca
Bio Informatics
31 Genetic Process
32 Introduction
Sciences reach a point where they become mathematized!
Math and other sciences
→ Physics: time of the Renaissance
→ Chemistry: after John Dalton developed atomic theory
33
34 DNA
DNA → Genes → Proteins
Two genes
Gene expression?
35 Genomic Sequence Data Interpretation
Gene finding
Structure prediction
Pattern discovery
Classification
Clustering
36 Understanding the Cell
Concrete problem: gene regulatory relationship inference
→ given: expression profiles of two genes A, B
→ want: decide if there is a (direct) regulatory relationship between A and B, and whether it is an activating or an inhibiting one
Complicating factors
→ imprecision/limitations in measuring expression profiles
→ indirect/complex regulatory relationships
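One naive first pass at this problem is to correlate the two expression profiles. The presentation does not say which method it intends, so this is only an illustration. In the sketch below, the function names, the labels and the 0.8 threshold are all assumptions. A strong positive Pearson correlation is read as a hint of activation and a strong negative one as a hint of inhibition.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)

def guess_relationship(profile_a, profile_b, threshold=0.8):
    """Label the relationship by correlation sign; threshold is an arbitrary assumption."""
    r = pearson(profile_a, profile_b)
    if r >= threshold:
        return "possibly activating"
    if r <= -threshold:
        return "possibly inhibiting"
    return "no clear relationship"

# Hypothetical expression levels of genes A and B at four time points.
print(guess_relationship([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0]))  # prints possibly inhibiting
```

Correlation cannot tell a direct relationship from an indirect one, nor which gene regulates which. Those gaps are exactly the complicating factors listed on the slide.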