CSE-700 Parallel Programming Assignment 6 POSTECH Oct 19, 2007 박성우
2 Species and Sequences Sequence 1 Species Sequence 2 Sequence n...
3 Ortholog Last Common Ancestor S Human S1 Dog S2 By speciation
4 Paralog Human S S1 S1' By duplication
5 Inparalog Last Common Ancestor S Human S1 Chimpanzee S2 By speciation S1' By duplication
6 Paralog - Outparalog LCA Human Dog LCA = Last Common Ancestor S S' S1 S1' S2 S2'
7 Coortholog S1' Species A S1 Species B S2 S2'
8 Input
Assume a total of n species S1, S2, ..., Sn
For each pair of species {Si, Sj}
  – Ortholog and paralog relations
Thus n(n + 1)/2 ortholog/paralog files
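The file count on this slide can be checked by enumerating the pairs. The sketch below assumes that the count n(n + 1)/2 includes each species paired with itself; the function name `species_pairs` is made up for illustration and is not part of the provided code.

```python
from itertools import combinations_with_replacement


def species_pairs(n):
    """Return all unordered pairs (i, j) with 1 <= i <= j <= n.

    Assumption: a species may be paired with itself, which is what
    makes the total n(n + 1)/2.
    """
    return list(combinations_with_replacement(range(1, n + 1), 2))


pairs = species_pairs(4)
print(len(pairs))  # 4 * 5 / 2 = 10
```

If only distinct species are paired, `itertools.combinations` gives n(n - 1)/2 pairs instead.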
9 Seed Ortholog Species A Si Species B Sj 1.0 Cluster
10 Invariant: No Two Seed Orthologs for Any Sequence Species A Si Species B Sj 1.0 Sk 1.0
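A minimal sketch of checking this invariant for one pair of species. It reads the diagram as saying that Si may not be a seed ortholog of both Sj and Sk. The integer sequence IDs and the name `find_violations` are assumptions for illustration only.

```python
def find_violations(seed_pairs):
    """Return the sequence IDs that occur in more than one seed pair.

    seed_pairs: iterable of (a, b) integer IDs, one tuple per seed
    ortholog pair taken from a single species-pair input.
    An empty result means the invariant holds.
    """
    seen = set()
    repeated = set()
    for a, b in seed_pairs:
        for seq in (a, b):
            if seq in seen:
                repeated.add(seq)
            seen.add(seq)
    return repeated


print(find_violations([(1, 2), (3, 4)]))  # no sequence is repeated
print(find_violations([(1, 2), (1, 3)]))  # sequence 1 has two seed orthologs
```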
11 Ortholog and Paralogs Species A Si Species B Sj 1.0 Cluster Si'
12 Output
Assume a total of n species S1, S2, ..., Sn
Ortholog and paralog relations among all these species
In each cluster,
  – seed ortholog from each pair of species
  – paralogs may be included.
13 Example of Cluster [1] A S1' S1 B S2 S2' C S3 S3' D S4' S4
14 Example of Cluster [2] A S1' S1 B S2 S2' C S3 S3' D S4' S4
15 Bad Clusters [1] A S1' S1 B S2 S2' C S3 S3' D S4' S4 E S5' S5
16 Bad Clusters [2] C S3 D S4' S4 E S6' S6 S4'' S5
17 Input File Format
Each line consists of:
  – Cluster number
  – Similarity score
  – Species name
  – Seed ortholog
  – Sequence name
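The field list above could be read as follows. This is only a sketch: it assumes the five fields are whitespace-separated and appear in the order listed, and that the seed-ortholog field is a number where 1.0 marks a seed ortholog (suggested by the "1.0" labels on slides 9 to 11). The names `Record`, `parse_line` and the sample line are hypothetical; the parser actually provided with the assignment may differ.

```python
from typing import NamedTuple


class Record(NamedTuple):
    cluster: int      # cluster number
    score: float      # similarity score
    species: str      # species name
    seed: float       # seed-ortholog field (assumed: 1.0 marks a seed ortholog)
    sequence: str     # sequence name


def parse_line(line):
    """Split one input line into its five fields (assumed whitespace-separated)."""
    cluster, score, species, seed, sequence = line.split()
    return Record(int(cluster), float(score), species, float(seed), sequence)


def is_seed(record):
    """True if the record is marked as a seed ortholog under the assumption above."""
    return record.seed == 1.0


rec = parse_line("1 3234 speciesA 1.000 seqA1")  # made-up sample line
print(rec.cluster, rec.species, rec.sequence, is_seed(rec))
```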
18 Goal
Implement ANY sequential algorithm
  – There is no definitive answer.
Then parallelize it.
A parser and an output module are provided.
  – no string comparison
  – all integer operations
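Since any sequential algorithm is acceptable, one naive starting point is to merge pairwise clusters that share a sequence, using a union-find structure over integer sequence IDs. This is a sketch of one possibility, not the intended answer. It assumes the parser has already mapped sequences to integers 0..num_sequences-1, and it makes no attempt to detect or avoid the bad clusters shown on slides 15 and 16. All names here are hypothetical.

```python
def find(parent, x):
    """Return the representative of x, shortening the path as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the sets containing a and b; the smaller root ID wins."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


def merge_clusters(num_sequences, pairwise_clusters):
    """Merge pairwise clusters (lists of integer IDs) that share a sequence.

    Returns the merged clusters with at least two members, each as a
    sorted list of sequence IDs.
    """
    parent = list(range(num_sequences))
    for cluster in pairwise_clusters:
        if not cluster:
            continue
        first = cluster[0]
        for seq in cluster[1:]:
            union(parent, first, seq)
    groups = {}
    for seq in range(num_sequences):
        groups.setdefault(find(parent, seq), []).append(seq)
    return [members for members in groups.values() if len(members) > 1]


print(merge_clusters(6, [[0, 1], [1, 2], [3, 4]]))  # [[0, 1, 2], [3, 4]]
```

Only integer comparisons and array indexing are used, in line with the "all integer operations" note. A parallel version would need to decide how the `parent` array is shared or partitioned, which this sketch leaves open.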