DNA SEQUENCE ALIGNMENT FOR PROTEIN SIMILARITY ANALYSIS CARL EBERLE, DANIEL MARTINEZ, MENGDI TAO.

1 DNA SEQUENCE ALIGNMENT FOR PROTEIN SIMILARITY ANALYSIS CARL EBERLE, DANIEL MARTINEZ, MENGDI TAO

2 SEQUENCE ALIGNMENT
Bioinformatics techniques that seek to determine the evolutionary distance between two segments of DNA or amino acid (AA) sequence rely on alignments.
Determining alignments can be done in many ways:
- Hamming Distance
- Optimality Criteria and Matrices
- Max/Min Parsimony
- Random Sequence Generation (Bootstrapping Analysis)
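As a small illustration of the simplest measure listed on this slide, the sketch below computes the Hamming distance between two sequences. It is a minimal example, not code from the presentation; the function name `hamming_distance` is hypothetical. Hamming distance is only defined for sequences of equal length, which is why gapped alignment methods are needed in the general case.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        # Hamming distance has no notion of insertions or deletions.
        raise ValueError("Hamming distance requires sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    # Toy sequences chosen for illustration only.
    print(hamming_distance("GATTACA", "GACTATA"))
```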

3 BACKGROUND
- Dynamic Programming
  - One of the first applications of dynamic programming
  - Memory heavy
  - Algorithmic approach to solutions (iterative)
- Needleman-Wunsch
  - Global alignment
  - Examines all amino acids and returns a solution based on the total sequences
  - Very inefficient
  - No longer used when better solutions are available
  - When they aren't, it is used for its simplicity

4 PROBLEM TO BE SOLVED
Determine the optimal alignment score for amino acid sequences derived from nucleotide sequences retrieved from the NCBI database
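The slide does not say how the amino acid sequence is obtained from the nucleotide sequence. One possibility, sketched below as an assumption, is to translate the coding sequence with the standard genetic code. The names `CODON_TABLE` and `translate` are hypothetical, and the sketch assumes the input already starts in the correct reading frame.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), listed in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, to_stop: bool = True) -> str:
    """Translate a DNA (or RNA) coding sequence, reading frame 0, into amino acids."""
    dna = dna.upper().replace("U", "T")
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # "X" marks ambiguous codons
        if aa == "*" and to_stop:
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # Toy sequence for illustration only.
    print(translate("ATGGCCTAA"))
```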

5 DEMO

6 FLOWCHART AND OVERVIEW

7 SOLUTION
1. Decide on the gene in question and query the NCBI database for it
2. Take the returned search results and choose the correct result
3. Drill down into the result and extract the entire Gene record
4. Parse the record and extract the sequence in question
5. Search the NCBI database for the same gene in another species
6. Drill down into the result and extract the entire Gene record
7. Parse the record and extract the sequence in question
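The retrieval steps above can be sketched with the public NCBI E-utilities endpoints. This is an illustration under stated assumptions, not the presenters' code: the slides do not say which database or record format was used, so the `nuccore` database and FASTA output are assumed here, and the helper names `esearch_url`, `efetch_url`, and `parse_fasta` are hypothetical. The sketch only builds the request URLs and parses a returned record; it performs no network access.

```python
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term: str, db: str = "nuccore", retmax: int = 5) -> str:
    """Steps 1 and 5: build a search URL that returns matching record IDs."""
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{EUTILS}/esearch.fcgi?{query}"


def efetch_url(uid: str, db: str = "nuccore") -> str:
    """Steps 3 and 6: build a URL that returns one full record as FASTA text."""
    query = urlencode({"db": db, "id": uid, "rettype": "fasta", "retmode": "text"})
    return f"{EUTILS}/efetch.fcgi?{query}"


def parse_fasta(text: str) -> tuple[str, str]:
    """Steps 4 and 7: return (header, sequence) for the first FASTA record."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # a second record starts here; stop after the first
            header = line[1:]
        elif header is not None:
            chunks.append(line.upper())
    if header is None:
        raise ValueError("no FASTA header found")
    return header, "".join(chunks)


if __name__ == "__main__":
    print(esearch_url("insulin[Gene] AND Homo sapiens[Organism]"))
    # Toy record for illustration only, not real NCBI data.
    print(parse_fasta(">toy record\nATGG\ncctaa\n"))
```

Biopython, which the presentation names for packaging sequences, wraps the same service in its `Bio.Entrez` module (`Entrez.esearch`, `Entrez.efetch`) and parses records with `Bio.SeqIO`.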

8 SOLUTION (CONTINUED)
8. Package the sequences in question in an easily manipulable form for further processing
   - Biopython
9. Initialize the optimality matrix
   - Needleman-Wunsch
10. Calculate the optimal alignment
11. Backtrack through the matrix to extract the aligned sequences
12. Generate the return strings to pass to the GUI
13. Display the results in a clear manner
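Steps 9 to 11 can be sketched as a plain Needleman-Wunsch implementation. This is a minimal illustration rather than the presenters' code: it assumes a simple match/mismatch score and a linear gap penalty instead of a BLOSUM matrix, and it returns only one optimal alignment. The function name `needleman_wunsch` and its default scores are assumptions.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    n, m = len(a), len(b)

    def s(x, y):
        # Assumed scoring: a substitution matrix lookup could go here instead.
        return match if x == y else mismatch

    # Step 9: initialize the matrix; the first row and column hold
    # cumulative gap penalties.
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap

    # Step 10: fill each cell with the best of diagonal, up, and left moves.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + s(a[i - 1], b[j - 1]),  # align a[i-1] with b[j-1]
                F[i - 1][j] + gap,                        # gap in b
                F[i][j - 1] + gap,                        # gap in a
            )

    # Step 11: backtrack from the bottom-right corner to recover the alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s(a[i - 1], b[j - 1]):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return F[n][m], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    score, top, bottom = needleman_wunsch("ACGT", "AGT")
    print(score)
    print(top)
    print(bottom)
```

The full matrix takes time and memory proportional to the product of the two sequence lengths, which is the cost the background slide describes as memory heavy.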

9 EXAMPLE ALGORITHM

10 DETAILS AND POSSIBLE IMPROVEMENTS
- BLOSUM matrix choices
- Random nucleotide matrix
- Correct calculation of similarity matrix
  - Log-odds selection
- Manual gene name input in GUI
- More intelligent alignment scheme
  - Needleman-Wunsch is not a great algorithm in terms of time complexity, efficiency, or accuracy
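For the log-odds item, the usual form of a substitution score is the scaled logarithm of how often two residues are observed aligned relative to how often they would pair by chance. The sketch below shows that calculation under stated assumptions: the function name `log_odds_score` is hypothetical, the frequencies in the usage line are made up, and `scale=2.0` gives half-bit units, the scaling commonly cited for BLOSUM62.

```python
import math


def log_odds_score(p_ij: float, q_i: float, q_j: float, scale: float = 2.0) -> int:
    """Rounded log-odds score for aligning residue i with residue j.

    p_ij      observed frequency of the aligned pair (i, j)
    q_i, q_j  background frequencies of the two residues
    scale     multiplier applied to the base-2 log (2.0 gives half-bits)
    """
    if min(p_ij, q_i, q_j) <= 0:
        raise ValueError("frequencies must be positive")
    return round(scale * math.log2(p_ij / (q_i * q_j)))


if __name__ == "__main__":
    # Illustrative frequencies only: the pair occurs twice as often as chance.
    print(log_odds_score(0.02, 0.1, 0.1))
```

A positive score means the pair is seen more often than chance would predict, a negative score less often, and zero means no preference either way.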

11 SHORTCOMINGS & IMPROVEMENT
- One optimal alignment returned
- Similarity scores not changeable on the fly
- Sequence type not changeable on the fly
- Change the font, font size, color of the results
  - Tag configure
- Extract table from HTML

12 THANK YOU! QUESTIONS?

