1 Bio + Informatics: An Overview
AAACTGCTGACCGGTAACTGAGGCCTGCCTGCAATTGCTTAACTTGGC
WWW.IBP.IR (Iranian Bioinformatics Portal)

2 Outline
Introduction
DNA
Definitions
Problems in bioinformatics
Conclusion

3 "Sciences reach a point where they become mathematized!" (Leonard Adleman)

4 Computing Devices
Computers → electronic components (transistors, …)
Brains → biological components (neurons, …)
Cells → biomolecular components (DNA, …)

5 DNA
Deoxyribonucleic acid: DNA
Four nucleotides (bases), or building blocks: A, T, G, C
Zips itself up into helices using base pairs:
→ A with T
→ G with C
DNA is essentially digital
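
The pairing rule and the "essentially digital" remark can be sketched in a few lines of Python. This is only an illustration: the function names and the particular 2-bit encoding are assumptions made here, not part of the slides.

```python
# Watson-Crick pairing from the slide: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Four symbols fit in two bits each; this particular assignment is arbitrary.
BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def complement(strand: str) -> str:
    """Return the base-by-base partner of a strand."""
    return "".join(PAIR[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own 5'-to-3' direction."""
    return complement(strand)[::-1]


def to_bits(strand: str) -> str:
    """Encode a strand as a bit string, two bits per base."""
    return "".join(BITS[base] for base in strand)


if __name__ == "__main__":
    print(complement("AAACTG"))
    print(reverse_complement("AAACTG"))
    print(to_bits("AAACTG"))
```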

6 Bioinformatics
Biomolecular computation
→ idea: use biomolecules and biochemical processes for solving computational problems
Computational molecular biology
→ goal: understand/explain biomolecular systems and mechanisms

7 After going through an age of specialization, the sciences are now reuniting into a common mode of inquiry.
"The next generation could produce a scientist in the old sense, a real generalist." (Leonard Adleman)

8 Biomolecular Computation
Idea: use biomolecules and biochemical processes for solving computational problems
Starting point: Leonard Adleman, 1994
→ solving the Hamiltonian Path Problem using liquid-phase DNA chemistry
Advantages:
→ fast
→ efficient in energy consumption
→ great storage capabilities
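
For readers who have not met the Hamiltonian Path Problem, here is a software sketch of the generate-and-filter idea: propose every ordering of the vertices and keep only those whose consecutive steps are edges. It is a brute-force illustration, not a model of the DNA chemistry, and the four-vertex graph in the usage lines is a made-up example.

```python
from itertools import permutations


def hamiltonian_paths(n, edges, start, end):
    """Yield each directed path from start to end that visits all n vertices once.

    Vertices are numbered 0..n-1 and edges is an iterable of (from, to) pairs.
    """
    edge_set = set(edges)
    middle = [v for v in range(n) if v not in (start, end)]
    for order in permutations(middle):          # generate every candidate
        path = (start, *order, end)
        # filter: every consecutive step must be an edge of the graph
        if all((a, b) in edge_set for a, b in zip(path, path[1:])):
            yield path


if __name__ == "__main__":
    # Hypothetical toy graph, chosen only for illustration.
    toy_edges = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3)]
    print(list(hamiltonian_paths(4, toy_edges, start=0, end=3)))
```

The number of candidate orderings grows factorially with the number of vertices, which is why the problem is attractive as a test case for massively parallel approaches.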

9 Computational Molecular Biology
Goal: understand/explain biomolecular systems and mechanisms
Bioinformatics:
→ application of computer technology to the management of biological information
→ using computers to gather, store, analyze and integrate biological and genetic information

10 Problems in Bioinformatics

11 Sequencing Genomes
GAGGGAACACAGTCTGCACACTCCTTCCGATAT
GAGGGAACACA GTCTGCACACT CCTTCCGATAT

12 Sequencing Genomes
GAGGGAACACAGTCTGCACACTCCTTCCGATAT
GAGGGAACACAGT AGTCTGCACACTC CTCCTTCCGATAT

13 Sequencing Genomes
Concrete problem: Sequence assembly problem
→ given: fragments of a large DNA sequence with overlaps (multiple coverage)
→ want: entire sequence
Complicating factors
→ computational complexity: can be seen as a variation of the shortest common superstring problem, which is known to be NP-hard
→ incorrect/missing nucleotides in fragment data
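
Because the exact problem is NP-hard, a common way to illustrate assembly is a greedy heuristic: repeatedly merge the two fragments with the longest suffix/prefix overlap. The sketch below assumes error-free fragments from a single strand, which real data does not guarantee, and the function names are chosen here for illustration.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments):
    """Merge fragments greedily by largest overlap; returns one superstring.

    Greedy merging is a heuristic: it yields a common superstring,
    not necessarily the shortest one.
    """
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates
    # drop fragments that are fully contained in another fragment
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_k, best_i, best_j = -1, 0, 0
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags[0] if frags else ""


if __name__ == "__main__":
    # The three overlapping fragments shown on slide 12.
    reads = ["GAGGGAACACAGT", "AGTCTGCACACTC", "CTCCTTCCGATAT"]
    print(greedy_assemble(reads))
```

On the slide 12 fragments each neighbouring pair overlaps by three bases, so the greedy merge recovers the 33-base sequence shown above them.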

14 Relation between Organisms
Concrete problem: Phylogenetic tree inference
→ given: homologous DNA sequences from multiple species
→ want: evolutionary tree relating these sequences
Complicating factors
→ errors in sequence
→ complexity/quality of multiple sequence alignment
→ limited knowledge of evolutionary processes
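
One simple distance-based way to build such a tree is average-linkage clustering (UPGMA) over pairwise mismatch fractions. The slides do not name a method, so this is a stand-in chosen for brevity; it assumes the sequences are already aligned to equal length, and the labels and sequences in the usage lines are invented.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs: dict):
    """Return a nested-tuple tree topology built by average linkage.

    seqs maps a label to an aligned sequence. Branch lengths are omitted.
    """
    # Each cluster is (subtree, list of member labels).
    clusters = [(name, [name]) for name in seqs]

    def dist(c1, c2):
        pairs = [(m, n) for m in c1[1] for n in c2[1]]
        return sum(p_distance(seqs[m], seqs[n]) for m, n in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        clusters = [c for c in clusters if c is not c1 and c is not c2]
        clusters.append(((c1[0], c2[0]), c1[1] + c2[1]))
    return clusters[0][0]


if __name__ == "__main__":
    # Hypothetical aligned sequences, for illustration only.
    toy = {"A": "AAACTGCT", "B": "AAACTGCA", "C": "AAGCTTCA", "D": "TTGCTTGA"}
    print(upgma(toy))
```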

15 Sequence Alignment

16 DNA-Genes-Proteins
DNA → Genes → Proteins
Basic molecule of life: directly controls the fundamental biology of life
Proteins determine the biological makeup of humans or any living organism
Variations and errors in the genomic DNA may lead to different diseases or disorders

17 DNA → Proteins
DNA (gene)
↓
mRNA
↓
Protein
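
The two arrows can be sketched as two small functions. The sketch assumes the input is the coding strand read 5' to 3', uses the standard genetic code, and ignores splicing and start-codon search; the function names are chosen here for illustration.

```python
from itertools import product

# Standard genetic code, listed in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def transcribe(coding_strand: str) -> str:
    """DNA coding strand to mRNA: same sequence with U in place of T."""
    return coding_strand.replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGGCCTAA")
    print(mrna, translate(mrna))
```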

18 Computational Gene Finding
Given: raw sequence data
Predict:
→ coding and non-coding regions
→ exons/introns
→ splicing patterns
→ transcription factors
Pre-mRNA: Exon1, Intron1, Exon2, Intron2, Exon3
mRNA: Exon1, Exon2, Exon3

19 Structure Prediction
RNA & Protein
Minimum free energy
RNA structure:
→ primary structure: single-stranded sequence of A, U, G, C
→ secondary structure: intra-molecular base pairs among its bases
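
Real minimum-free-energy prediction needs an energy model, which the slides do not give. As a simplified stand-in, the Nussinov-style dynamic program below just maximizes the number of nested base pairs. It assumes only A-U and G-C pairs and a minimum of three unpaired bases inside a hairpin; both are assumptions made for this sketch.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested (pseudoknot-free) base pairs in seq."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = max pairs within seq[i..j]; short spans stay 0.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]              # case 1: j is unpaired
            for k in range(i, j - min_loop):    # case 2: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(max_base_pairs("GGGAAAUCCC"))
```

The table has O(N^2) entries and each takes O(N) work, so the run time is O(N^3) for a sequence of length N.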

20 Secondary Structure
5’-GAGGGAACACAGUCUGCACACUCCUUC-3’

21 Arc Diagram Representation

22 Loops
Hairpin loop
Interior loop
Multi loop
External loop
Bulge loop
Stacked pair

23 Pseudoknotted Structure
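
In an arc diagram, a structure is pseudoknotted when two arcs cross. A minimal check, assuming a structure is given as a list of (i, j) position pairs (a representation chosen here for illustration):

```python
def is_pseudoknotted(pairs) -> bool:
    """True if any two base pairs (i, j) and (k, l) cross: i < k < j < l."""
    ordered = sorted((min(p), max(p)) for p in pairs)
    for index, (i, j) in enumerate(ordered):
        for k, l in ordered[index + 1:]:
            if i < k < j < l:
                return True
    return False


if __name__ == "__main__":
    print(is_pseudoknotted([(1, 6), (2, 5)]))    # nested arcs
    print(is_pseudoknotted([(1, 8), (5, 12)]))   # crossing arcs
```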

24 Structure Prediction Algorithms
Dynamic programming algorithms
→ restricted class of pseudoknotted structures
→ Rivas and Eddy (R&E): O(N^6)
Heuristic algorithms
→ search over the solution space

25 Motif Discovery

26 Genes and Diseases
Proteins perform all of life’s essential functions
Changes in the genomic DNA sequence can have disastrous consequences

27 Real World Applications

28 Related Aspects
Computational models of organisms or biological systems
Nature-inspired algorithms
→ genetic algorithms
→ neural networks
→ ant colony optimization
Artificial life
→ life-like behavior of artificial systems
→ (re)design of biological organisms

29 Conclusion
Bioinformatics: using computers for gathering, storing and analyzing biological data
Analyzing AAACTGCTGACCGGTAACTGAGGCCTGCCTGCAATTGCTTAACTTGGC

30 Thank you!
Baharak Rastegari, baharak@cs.ubc.ca

31 Genetic Process

32 Introduction
Sciences reach a point where they become mathematized!
Math and other sciences
→ Physics: time of the Renaissance
→ Chemistry: after John Dalton developed atomic theory

34 DNA
Gene expression?
Two genes
DNA → Genes → Proteins

35 Genomic Sequence Data Interpretation
Gene finding
Structure prediction
Pattern discovery
Classification
Clustering

36 Understanding the Cell
Concrete problem: Gene regulatory relationship inference
→ given: expression profiles of two genes A, B
→ want: decide if there is a (direct) regulatory relationship between A and B, and whether it is an activating or an inhibiting one
Complicating factors
→ imprecision/limitations in measuring expression profiles
→ indirect/complex regulatory relationships
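
A naive first pass at this problem is to correlate the two expression profiles: strong positive correlation hints at activation, strong negative correlation at inhibition. The sketch below uses Pearson correlation with an arbitrary threshold of 0.8. Both choices are assumptions made here, and correlation alone cannot separate direct from indirect regulation, which is exactly the second complicating factor above.

```python
from math import sqrt


def pearson(x, y) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


def guess_relationship(profile_a, profile_b, threshold: float = 0.8) -> str:
    """Label the pair from the sign and strength of the correlation."""
    r = pearson(profile_a, profile_b)
    if r >= threshold:
        return "activating"
    if r <= -threshold:
        return "inhibiting"
    return "none detected"


if __name__ == "__main__":
    # Hypothetical expression measurements at four time points.
    print(guess_relationship([1, 2, 3, 4], [2, 4, 6, 8]))
    print(guess_relationship([1, 2, 3, 4], [8, 6, 4, 2]))
```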

