Today’s Agenda
– Exam post-mortem (15-25 min)
– Grades & Status (5 min)
– Derek’s presentation (15-25 min)
– Exam #2: Question #1 (time permitting)
Exam post-mortem
1. Edit distance
-3 if you made a minor (cascading) error
-3 if you mis-initialized the matrix
For the Final: check your answer by hand.
And remember… (matrix with GTCA across the top; rows labeled A 1, T 2, G 3)
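As a way to check a hand-computed answer, here is a minimal sketch of the edit-distance matrix in C. It assumes unit cost for each insertion, deletion, and substitution; the slides do not state the cost model used on the exam, and the name `edit_distance` is illustrative. Note the initialization step: the first row and first column hold 0, 1, 2, … before any other cell is filled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int min3(int a, int b, int c) {
    int m = a < b ? a : b;
    return m < c ? m : c;
}

/* Edit distance between s and t, assuming unit-cost insert, delete,
 * and substitute (an assumption, not taken from the slides).
 * Returns -1 if memory cannot be allocated. */
int edit_distance(const char *s, const char *t) {
    assert(s != NULL && t != NULL);
    size_t n = strlen(s), m = strlen(t);
    size_t w = m + 1;                       /* row width of the matrix */
    int *d = malloc((n + 1) * w * sizeof *d);
    if (d == NULL) return -1;

    /* Initialization: turning a prefix into the empty string costs
     * one edit per symbol. */
    for (size_t i = 0; i <= n; i++) d[i * w] = (int)i;
    for (size_t j = 0; j <= m; j++) d[j] = (int)j;

    for (size_t i = 1; i <= n; i++) {
        for (size_t j = 1; j <= m; j++) {
            int cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
            d[i * w + j] = min3(d[(i - 1) * w + j] + 1,          /* delete */
                                d[i * w + (j - 1)] + 1,          /* insert */
                                d[(i - 1) * w + (j - 1)] + cost  /* match or substitute */);
        }
    }
    int result = d[n * w + m];
    free(d);
    return result;
}
```

Calling `edit_distance("AGCATCT", "CATGCTA")` returns 4 under this cost model.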
Exam post-mortem
2. Naïve Approach
No approach can ever produce a solution that is better than optimal: no edit distance can be smaller than the optimal edit distance.
Full credit was given if you noticed that matching the first A led to a sub-optimal answer.
AGCATCT / CATGCTA: 6 edits (sub-optimal)
AGCATCT / CATGCTA: 4 edits
Exam post-mortem
4. Bits, binary code, 1’s and 0’s, etc.
5. Deoxyribonucleic Acid
6. RNA Uracil, Guanine, Cytosine, Adenine
8. A gene is a segment of DNA that encodes a protein or regulates a gene that does.
Exam post-mortem
9. Draw picture
[Figure: chain of Acid-Sugar units, each with an attached base; base labels in order: A, T, G, A, T, A, C, T]
Exam post-mortem
billion
11. Transcription, Translation
12. Intron regions are spliced out (removed)
13. In reality, there are 20 amino acids in protein sequences
14. In theory, 3 RNA bases can encode $4^3 = 64$ different combinations (4 different RNA bases: A, C, G, U)
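The count in answer 14 can be checked by brute force. The sketch below enumerates every ordered triple of the four RNA bases; the function name `count_codons` is illustrative.

```c
#include <assert.h>
#include <stdio.h>

static const char BASES[] = "ACGU";   /* the 4 RNA bases */

/* Enumerate every 3-base combination and return how many there are.
 * If print is nonzero, each combination is written to stdout. */
int count_codons(int print) {
    int count = 0;
    for (int a = 0; a < 4; a++)
        for (int b = 0; b < 4; b++)
            for (int c = 0; c < 4; c++) {
                if (print)
                    printf("%c%c%c\n", BASES[a], BASES[b], BASES[c]);
                count++;
            }
    assert(count == 4 * 4 * 4);       /* 4^3 combinations */
    return count;
}
```

The result, 64, is more than three times the 20 amino acids mentioned in answer 13.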
Exam post-mortem
Several times I mentioned that was not accurate.
16. Several hundred. Technically 100’s, but 1000’s is OK.
Exam post-mortem
18. Global alignment -> Whole genome comparison
19. Local alignment -> Searching for genes
20. Multiple alignment -> Shared pattern discovery -> gene discovery
Exam post-mortem
21. Finding the first two symbols that match requires:
a. Finding two symbols that match, such that
b. the number of edits needed to match the two symbols is minimized
TAAAC…
CACTA…
Matching the pair of A’s is the best option (so far).
CGCTGGCC…
AAATATTC…
Sometimes the best match is deep in the sequences.
Exam post-mortem
21. Finding the first two symbols that match:
a. Is as hard (computationally) as solving the whole problem.
b. What if there are no symbols that match?
c. What if the sequences have 100’s or 1000’s of different types of symbols?
Exam post-mortem
/* n = length of both sequences; 2 * n exceeds any valid i + j */
min = 2 * n;
for (i = 0; i < n && i < min; i++)
    for (j = 0; j < n && j < min; j++)
        if ((seq1[i] == seq2[j]) && ((i + j) < min)) {
            min = i + j;
            mini = i;
            minj = j;
        }
If any symbols match, the first match occurs at seq1[mini] and seq2[minj].
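The search above can be wrapped as a self-contained function and tried on the two sequence pairs from question 21. This is a sketch under a few assumptions: the name `first_match` and its return convention are illustrative, the sequences are NUL-terminated strings that may differ in length, and the function reports the no-match case from question 21b instead of leaving the indices undefined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Find the matching pair seq1[i] == seq2[j] with the smallest i + j.
 * Returns 1 and stores the indices in *mini and *minj if a match
 * exists; returns 0 (indices untouched) if no symbols match. */
int first_match(const char *seq1, const char *seq2,
                size_t *mini, size_t *minj) {
    assert(seq1 != NULL && seq2 != NULL && mini != NULL && minj != NULL);
    size_t n1 = strlen(seq1), n2 = strlen(seq2);
    size_t best = n1 + n2;            /* larger than any valid i + j */
    int found = 0;

    /* Once i reaches best, no pair in that row can improve on it. */
    for (size_t i = 0; i < n1 && i < best; i++) {
        for (size_t j = 0; j < n2 && i + j < best; j++) {
            if (seq1[i] == seq2[j]) {
                best = i + j;
                *mini = i;
                *minj = j;
                found = 1;
            }
        }
    }
    return found;
}
```

For TAAAC and CACTA the function reports indices (1, 1), the pair of A’s. For CGCTGGCC and AAATATTC it reports (3, 3), a pair of T’s deeper in both sequences.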
Grades & Status
Exam #    median    average 87.5
Grades & Status
Current Ave.   Grade
97.6           A
96.4           A
95.2           A
94.4           A
93.0           A
92.8           A    (median)
90.4           A-
89.8           A-
84.6           B
78.4           C+
74.0           C
average: 89.7
Grades & Status
Things will get harder
Right now, there is no workload…
Soon, project #2 will be out
The remaining material involves
– Math (probability)
– Algorithms
– Code
Grades & Status
Advice: Get moving with your paper and presentation. Get it out of the way.
– Project #2 will be challenging…
– Project #2 will be given out on Tuesday.
Exam #2: Question #1
Given two sequences, compute the optimal local alignment.
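One common way to approach this question is the Smith-Waterman dynamic program, sketched below in C. The slides do not give a scoring scheme, so the values here (match +1, mismatch -1, gap -1) are assumptions, as is the function name `local_alignment_score`. The sketch computes only the optimal score; recovering the aligned substrings would need the full matrix and a traceback, which is omitted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed scoring scheme (not taken from the slides). */
enum { MATCH = 1, MISMATCH = -1, GAP = -1 };

static int max2(int a, int b) { return a > b ? a : b; }

/* Best local alignment score of a and b (Smith-Waterman recurrence).
 * Keeps only two rows of the matrix. Returns -1 if memory cannot be
 * allocated. */
int local_alignment_score(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    size_t n = strlen(a), m = strlen(b);
    int *prev = calloc(m + 1, sizeof *prev);   /* row i - 1, starts at 0 */
    int *curr = calloc(m + 1, sizeof *curr);   /* row i */
    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return -1;
    }

    int best = 0;
    for (size_t i = 1; i <= n; i++) {
        curr[0] = 0;
        for (size_t j = 1; j <= m; j++) {
            int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? MATCH : MISMATCH);
            int up   = prev[j] + GAP;          /* gap in b */
            int left = curr[j - 1] + GAP;      /* gap in a */
            /* A local alignment may restart anywhere, so scores never
             * drop below zero. */
            curr[j] = max2(0, max2(diag, max2(up, left)));
            best = max2(best, curr[j]);
        }
        int *tmp = prev;                       /* reuse the two rows */
        prev = curr;
        curr = tmp;
    }
    free(prev);
    free(curr);
    return best;
}
```

Unlike the edit-distance matrix, the first row and column are initialized to zero, and the answer is the largest cell anywhere in the matrix, not the bottom-right corner.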