Incorporating Bioinformatics in an Algorithms Course
Lawrence D’Antonio
Ramapo College of New Jersey

What is Bioinformatics?
- Algorithms to analyze DNA, RNA, or protein sequences
- Database searches to find homologous sequences
- Construction of evolutionary trees
- Structure prediction
- Human Genome Project

Why Use Bioinformatics in an Algorithms Course?
- Real-life applications of algorithms
- Variety of string processing algorithms
- Use of similarity instead of exact matching
- Dynamic programming examples
- Theory vs. practice issues

Models for Incorporating Bioinformatics
- Infusion: include material from bioinformatics in computer science courses
- Paired courses: have joint lectures and projects from, e.g., Algorithms and Genetics courses
- Tracked courses: have a separate Algorithms for Bioinformatics course

Biology Basics
- Primary DNA structure: an oriented character string
- The double strand is constructed through base pairing
- Central Dogma: information passes in one direction, from DNA to RNA to protein
- Amino acids are encoded by triples of bases, called codons
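The string view of DNA can be made concrete with a short sketch. It assumes the standard Watson-Crick pairing (A with T, C with G) and that the partner strand is read in the opposite orientation; the function names `reverse_complement` and `codons` are illustrative, not taken from the slides.

```python
# Watson-Crick pairing: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the partner strand, read in its own orientation."""
    return strand.translate(COMPLEMENT)[::-1]


def codons(strand: str) -> list[str]:
    """Split a strand into consecutive base triples, dropping any leftover bases."""
    usable = len(strand) - len(strand) % 3
    return [strand[i:i + 3] for i in range(0, usable, 3)]


print(reverse_complement("AATAAGC"))
print(codons("GGGCAT"))
```

Treating a strand as an ordinary string is what lets the string algorithms in the rest of the talk apply directly.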

Bonding along a strand

Bonding between strands

Complexity of DNA Problems
- 3 billion base pairs in the human genome
- Many NP-complete problems
- About 10^600 possible alignments for two 1000-character sequences
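The size of the alignment search space can be checked with a quick calculation. The slide does not say how alignments are counted; the sketch below assumes one simple convention, the number of ways to interleave two sequences of length n, which is the binomial coefficient C(2n, n). The helper name `alignment_count` is hypothetical.

```python
import math


def alignment_count(n: int) -> int:
    """Ways to interleave two length-n sequences: C(2n, n) (an assumed counting convention)."""
    return math.comb(2 * n, n)


count = alignment_count(1000)
# The number of decimal digits minus one gives the order of magnitude.
print(len(str(count)) - 1)
```

Under this convention the count for n = 1000 is on the order of 10^600, which is why exhaustive enumeration is out of the question and dynamic programming is needed.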

Sequence Alignment
- Global alignment: determine the alignment of two sequences that maximizes similarity
- Local alignment: determine substrings of two sequences with maximum similarity
- Multiple alignment: determine the alignment for several sequences that maximizes the sum-of-pairs similarity

Edit Operations
Substitution:
  AATAAGC
  ATTAAGC
Insertion:
  AAT-AAGC
  AATTAAGC
Deletion:
  AATAAGC
  AA-AAGC
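The three edit operations can be read off an aligned pair column by column. The following is a minimal sketch, assuming the top row is the original sequence, the bottom row is the edited one, and `-` marks a space; `classify_columns` is an illustrative name.

```python
def classify_columns(top: str, bottom: str, gap: str = "-") -> list[str]:
    """Label each column of an aligned pair with the edit operation it represents."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal length")
    ops = []
    for x, y in zip(top, bottom):
        if x == gap and y == gap:
            raise ValueError("a column cannot pair a space with a space")
        if x == gap:
            ops.append("insertion")     # character present only in the bottom row
        elif y == gap:
            ops.append("deletion")      # character present only in the top row
        elif x == y:
            ops.append("match")
        else:
            ops.append("substitution")
    return ops


print(classify_columns("AAT-AAGC", "AATTAAGC"))
```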

Dynamic Programming Alignment Algorithm (Needleman-Wunsch)
If a_1, a_2, …, a_i and b_1, b_2, …, b_j have been aligned, there are three possible next moves:
- Match a_{i+1} with b_{j+1}
- Match a_{i+1} with a space (-)
- Match b_{j+1} with a space (-)
Choose the move that maximizes the similarity of the two sequences.
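The three moves can be written as a recurrence. The notation here is an assumption, not from the slides: S(i, j) is the best similarity of the prefixes a_1…a_i and b_1…b_j, s(x, y) is the score for pairing characters x and y, and g is the (negative) score for a space.

```latex
\begin{align*}
S(0,0) &= 0, \qquad S(i,0) = i\,g, \qquad S(0,j) = j\,g, \\
S(i+1,\,j+1) &= \max
\begin{cases}
S(i,\,j) + s(a_{i+1},\, b_{j+1}) & \text{match } a_{i+1} \text{ with } b_{j+1} \\
S(i,\,j+1) + g & \text{match } a_{i+1} \text{ with a space} \\
S(i+1,\,j) + g & \text{match } b_{j+1} \text{ with a space}
\end{cases}
\end{align*}
```

The similarity of the full sequences is the last entry, S(m, n), where m and n are the sequence lengths.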

Alignment Scoring System
- Character match: +1
- Mismatch (substitution): -1
- Space (indel): -2, or a + b·k for a gap of k spaces (affine gap penalty)

Global Alignment Matrix
        -    G    G    A    C    A
   -    0   -2   -4   -6   -8  -10
   G   -2    1   -1   -3   -5   -7
   G   -4   -1    2    0   -2   -4
   G   -6   -3    0    1   -1   -3
   C   -8   -5   -2   -1    2    0
   A  -10   -7   -4   -1    0    3
   T  -12   -9   -6   -3   -2    1
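A matrix like this one can be filled in a few lines. The sketch below assumes the linear scoring of +1 for a match, -1 for a mismatch, and -2 for a space, with GGGCAT labeling the rows and GGACA the columns; `nw_matrix` is an illustrative name.

```python
def nw_matrix(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Fill the global alignment score matrix; rows follow a, columns follow b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Border cells: a prefix aligned entirely against spaces.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # align a[i-1] with a space
                score[i][j - 1] + gap,       # align b[j-1] with a space
            )
    return score


for row in nw_matrix("GGGCAT", "GGACA"):
    print(" ".join(f"{v:4d}" for v in row))
```

The bottom-right entry is the similarity of the two full sequences. The fill takes time and space proportional to the product of the two lengths.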

Optimal Alignment
  GGGCAT
  GGACA-
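An optimal alignment is recovered by walking back from the bottom-right cell of the score matrix, at each step choosing a move that explains the current score. The sketch below assumes the +1 / -1 / -2 scoring; the matrix literal `SCORE` is the filled matrix for GGGCAT (rows) against GGACA (columns), and the name `traceback` is illustrative.

```python
SCORE = [
    [0, -2, -4, -6, -8, -10],
    [-2, 1, -1, -3, -5, -7],
    [-4, -1, 2, 0, -2, -4],
    [-6, -3, 0, 1, -1, -3],
    [-8, -5, -2, -1, 2, 0],
    [-10, -7, -4, -1, 0, 3],
    [-12, -9, -6, -3, -2, 1],
]


def traceback(a, b, score, match=1, mismatch=-1, gap=-2):
    """Walk from the bottom-right cell to the top-left, rebuilding one optimal alignment."""
    i, j = len(a), len(b)
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + pair:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])      # a[i-1] aligned with a space
            bottom.append("-")
            i -= 1
        else:
            top.append("-")           # b[j-1] aligned with a space
            bottom.append(b[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


print(*traceback("GGGCAT", "GGACA", SCORE), sep="\n")
```

When several moves explain a cell, this sketch prefers the diagonal, so it returns one optimal alignment rather than all of them.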

Other Bioinformatics Algorithms
- Palindromes
- Tandem repeats
- Longest common subsequence
- Double digest (NP-complete)
- Shortest common superstring (NP-complete)
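Of these, longest common subsequence is the closest relative of alignment and uses the same style of dynamic programming. A minimal sketch of the standard length computation follows, keeping only two rows of the table; `lcs_length` is an illustrative name.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b, by dynamic programming."""
    prev = [0] * (len(b) + 1)          # row for the empty prefix of a
    for x in a:
        cur = [0]                      # column for the empty prefix of b
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)           # extend a common subsequence
            else:
                cur.append(max(prev[j], cur[j - 1]))  # drop one character
        prev = cur
    return prev[-1]


print(lcs_length("GGGCAT", "GGACA"))
```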

