Presentation is loading. Please wait.

Presentation is loading. Please wait.

Phylogenetic Trees Copyright © Software Carpentry 2010 This work is licensed under the Creative Commons Attribution License See

Similar presentations


Presentation on theme: "Phylogenetic Trees Copyright © Software Carpentry 2010 This work is licensed under the Creative Commons Attribution License See"— Presentation transcript:

1 Phylogenetic Trees Copyright © Software Carpentry 2010 This work is licensed under the Creative Commons Attribution License See http://software-carpentry.org/license.html for more information. Sets and Dictionaries

2 Phylogenetic Trees Some organisms are more alike than others

3 Sets and DictionariesPhylogenetic Trees Some organisms are more alike than others

4 Sets and DictionariesPhylogenetic Trees Some organisms are more alike than others

5 Sets and DictionariesPhylogenetic Trees Some organisms are more alike than others

6 Sets and DictionariesPhylogenetic Trees Some organisms are more alike than others

7 Sets and DictionariesPhylogenetic Trees Some organisms are more alike than others ?

8 Sets and DictionariesPhylogenetic Trees Nothing in biology makes sense except in the light of evolution. — Theodosius Dobzhansky The closer their DNA, the more recently they had a common ancestor Reconstruct their evolutionary tree using a hierarchical clustering algorithm

9 Sets and DictionariesPhylogenetic Trees Calculate the common ancestor of the two closest

10 Sets and DictionariesPhylogenetic Trees Then of the next closest

11 Sets and DictionariesPhylogenetic Trees And the next

12 Sets and DictionariesPhylogenetic Trees Combine organisms with common ancestors

13 Sets and DictionariesPhylogenetic Trees And common ancestors with each other

14 Sets and DictionariesPhylogenetic Trees To construct a complete tree

15 Sets and DictionariesPhylogenetic Trees Redraw

16 Sets and DictionariesPhylogenetic Trees Redraw using height to show number of differences

17 Sets and DictionariesPhylogenetic Trees Turn this into an algorithm

18 Sets and DictionariesPhylogenetic Trees U = {all organisms} while U ≠ {}: a, b = two closest entries in U p = common parent of {a, b} U = U - {a, b} U = U + {p} Turn this into an algorithm

19 Sets and DictionariesPhylogenetic Trees U = {all organisms} while U ≠ {}: a, b = two closest entries in U p = common parent of {a, b} U = U - {a, b} U = U + {p} Ungrouped set shrinks by one element each time Turn this into an algorithm

20 Sets and DictionariesPhylogenetic Trees U = {all organisms} while U ≠ {}: a, b = two closest entries in U p = common parent of {a, b} U = U - {a, b} U = U + {p} Ungrouped set shrinks by one element each time Keep track of pairings on the side to draw tree later Turn this into an algorithm

21 Sets and DictionariesPhylogenetic Trees What does "closest" mean?

22 Sets and DictionariesPhylogenetic Trees What does "closest" mean? Simplest algorithm is unweighted pair-group method using arithmetic averages (UPGMA)

23 Sets and DictionariesPhylogenetic Trees What does "closest" mean? Simplest algorithm is unweighted pair-group method using arithmetic averages (UPGMA) humanvampirewerewolfmermaid human vampire13 werewolf56 mermaid121529

24 Sets and DictionariesPhylogenetic Trees Closest entries are human (H) and werewolf (W) HVWM H V13 W56 M121529

25 Sets and DictionariesPhylogenetic Trees Closest entries are human (H) and werewolf (W) Replace with HW (common ancestor) HVWM H V13 W56 M121529

26 Sets and DictionariesPhylogenetic Trees Closest entries are human (H) and werewolf (W) Replace with HW (common ancestor) Height is 1/2 value of entry HVWM H V13 W56 M121529

27 Sets and DictionariesPhylogenetic Trees Closest entries are human (H) and werewolf (W) Replace with HW (common ancestor) Height is 1/2 value of entry Replace score for X with (HX + WX - HW)/2 HVWM H V13 W56 M121529

28 Sets and DictionariesPhylogenetic Trees HVWM H V13 W56 M121529 Closest entries are human (H) and werewolf (W) Replace with HW (common ancestor) Height is 1/2 value of entry Replace score for X with (HX + WX - HW)/2

29 Sets and DictionariesPhylogenetic Trees HVWM H V13 W56 M121529 HWVM V7 M1815 Closest entries are human (H) and werewolf (W) Replace with HW (common ancestor) Height is 1/2 value of entry Replace score for X with (HX + WX - HW)/2

30 Sets and DictionariesPhylogenetic Trees HVWM H V13 W56 M121529 HWVM V7 M1815 HW HW 2.5 Closest entries are human (H) and werewolf (W) Replace with HW (common ancestor) Height is 1/2 value of entry Replace score for X with (HX + WX - HW)/2

31 Sets and DictionariesPhylogenetic Trees Repeat HWVM M13 HWVM V7 M1815 HW HW 3.5 V HWV

32 Sets and DictionariesPhylogenetic Trees Repeat HWVM M13 HWVM V7 M1815 HW HW 3.5 V HWV And again HWV M HW HW 6.5 V HWV M M13 M HWVM

33 Sets and DictionariesPhylogenetic Trees Final tree MVHWMVHW 10 9 8 7 6 5 4 3 2 1 0 HW HWV HWVM 6.5 2.5 3.5

34 Sets and DictionariesPhylogenetic Trees Final tree MVHWMVHW 10 9 8 7 6 5 4 3 2 1 0 HW HWV HWVM 1.0 3.0 Implied 6.5 2.5 3.5

35 Sets and DictionariesPhylogenetic Trees How to translate this into software?

36 Sets and DictionariesPhylogenetic Trees How to translate this into software? We drew it as a triangular matrix…

37 Sets and DictionariesPhylogenetic Trees How to translate this into software? We drew it as a triangular matrix… …but the order of the rows and columns is arbitrary

38 Sets and DictionariesPhylogenetic Trees How to translate this into software? We drew it as a triangular matrix… …but the order of the rows and columns is arbitrary It's really just a lookup table…

39 Sets and DictionariesPhylogenetic Trees How to translate this into software? We drew it as a triangular matrix… …but the order of the rows and columns is arbitrary It's really just a lookup table… …so we should think about using a dictionary

40 Sets and DictionariesPhylogenetic Trees How to translate this into software? We drew it as a triangular matrix… …but the order of the rows and columns is arbitrary It's really just a lookup table… …so we should think about using a dictionary Key: (organism, organism)

41 Sets and DictionariesPhylogenetic Trees How to translate this into software? We drew it as a triangular matrix… …but the order of the rows and columns is arbitrary It's really just a lookup table… …so we should think about using a dictionary Key: (organism, organism) –In alphabetical order to ensure uniqueness

42 Sets and DictionariesPhylogenetic Trees How to translate this into software? We drew it as a triangular matrix… …but the order of the rows and columns is arbitrary It's really just a lookup table… …so we should think about using a dictionary Key: (organism, organism) –In alphabetical order to ensure uniqueness Value: distance

43 Sets and DictionariesPhylogenetic Trees HVWM H V13 W56 M121529

44 Sets and DictionariesPhylogenetic Trees HVWM H V13 W56 M121529 { ('human', 'mermaid') : 12, ('human', 'vampire') : 13, ('human', 'werewolf') : 5, ('mermaid', 'vampire') : 15, ('mermaid', 'werewolf') : 29, ('vampire', 'werewolf') : 6 }

45 Sets and DictionariesPhylogenetic Trees Write out the algorithm

46 Sets and DictionariesPhylogenetic Trees while len(scores) > 0: min_pair = find_min_pair(species, scores) parent, height = create_new_parent(scores, min_pair) print parent, height old_score = remove_entries(species, scores, min_pair) update(species, scores, min_pair, parent, old_score) Write out the algorithm

47 Sets and DictionariesPhylogenetic Trees while len(scores) > 0: min_pair = find_min_pair(species, scores) parent, height = create_new_parent(scores, min_pair) print parent, height old_score = remove_entries(species, scores, min_pair) update(species, scores, min_pair, parent, old_score) Assumes scores are in a dictionary scores Write out the algorithm

48 Sets and DictionariesPhylogenetic Trees while len(scores) > 0: min_pair = find_min_pair(species, scores) parent, height = create_new_parent(scores, min_pair) print parent, height old_score = remove_entries(species, scores, min_pair) update(species, scores, min_pair, parent, old_score) Assumes scores are in a dictionary scores And species names are in a list species Write out the algorithm

49 Sets and DictionariesPhylogenetic Trees while len(scores) > 0: min_pair = find_min_pair(species, scores) parent, height = create_new_parent(scores, min_pair) print parent, height old_score = remove_entries(species, scores, min_pair) update(species, scores, min_pair, parent, old_score) Assumes scores are in a dictionary scores And species names are in a list species And yes, we revised this a couple of times… Write out the algorithm

50 Sets and DictionariesPhylogenetic Trees def find_min_pair(species, scores): '''Find minimum-value pair of species in scores.''' min_pair, min_val = None, None for pair in combos(species): assert pair in scores, \ 'Pair (%s, %s) not in scores' % pair if (min_val is None) or (scores[pair] < min_val): min_pair, min_val = pair, scores[pair] assert min_val is not None, \ 'No minimum value found in scores' return min_pair

51 Sets and DictionariesPhylogenetic Trees def find_min_pair(species, scores): '''Find minimum-value pair of species in scores.''' min_pair, min_val = None, None for pair in combos(species): assert pair in scores, \ 'Pair (%s, %s) not in scores' % pair if (min_val is None) or (scores[pair] < min_val): min_pair, min_val = pair, scores[pair] assert min_val is not None, \ 'No minimum value found in scores' return min_pair

52 Sets and DictionariesPhylogenetic Trees def find_min_pair(species, scores): '''Find minimum-value pair of species in scores.''' min_pair, min_val = None, None for pair in combos(species): assert pair in scores, \ 'Pair (%s, %s) not in scores' % pair if (min_val is None) or (scores[pair] < min_val): min_pair, min_val = pair, scores[pair] assert min_val is not None, \ 'No minimum value found in scores' return min_pair

53 Sets and DictionariesPhylogenetic Trees def combos(species): '''Generate all combinations of species.''' result = [] for i in range(len(species)): for j in range(i+1, len(species)): result.append((species[i], species[j])) return result

54 Sets and DictionariesPhylogenetic Trees def combos(species): '''Generate all combinations of species.''' result = [] for i in range(len(species)): for j in range(i+1, len(species)): result.append((species[i], species[j])) return result

55 Sets and DictionariesPhylogenetic Trees def combos(species): '''Generate all combinations of species.''' result = [] for i in range(len(species)): for j in range(i+1, len(species)): result.append((species[i], species[j])) return result def combos(): def find_min_pair(): if __name__ == '__main__':...main program...

56 Sets and DictionariesPhylogenetic Trees def create_new_parent(scores, pair): '''Create record for new parent.''' parent = '[%s %s]' % pair height = scores[pair] / 2. return parent, height

57 Sets and DictionariesPhylogenetic Trees def create_new_parent(scores, pair): '''Create record for new parent.''' parent = '[%s %s]' % pair height = scores[pair] / 2. return parent, height

58 Sets and DictionariesPhylogenetic Trees def create_new_parent(scores, pair): '''Create record for new parent.''' parent = '[%s %s]' % pair height = scores[pair] / 2. return parent, height def combos(): def find_min_pair(): def create_new_parent(): if __name__ == '__main__':...main program...

59 Sets and DictionariesPhylogenetic Trees def remove_entries(species, scores, pair): '''Remove species that have been combined.''' left, right = pair species.remove(left) species.remove(right) old_score = scores[pair] del scores[pair] return old_score

60 Sets and DictionariesPhylogenetic Trees def remove_entries(species, scores, pair): '''Remove species that have been combined.''' left, right = pair species.remove(left) species.remove(right) old_score = scores[pair] del scores[pair] return old_score

61 Sets and DictionariesPhylogenetic Trees def remove_entries(species, scores, pair): '''Remove species that have been combined.''' left, right = pair species.remove(left) species.remove(right) old_score = scores[pair] del scores[pair] return old_score def combos(): def find_min_pair(): def create_new_parent(): def remove_entries(): if __name__ == '__main__':...main program...

62 Sets and DictionariesPhylogenetic Trees def update(species, scores, pair, parent, parent_score): '''Replace two species from the scores table.''' left, right = pair for other in species: l_score = tidy_up(scores, left, other) r_score = tidy_up(scores, right, other) new_pair = make_pair(parent, other) new_score = (l_score + r_score - parent_score)/2. scores[new_pair] = new_score species.append(parent) species.sort()

63 Sets and DictionariesPhylogenetic Trees def update(species, scores, pair, parent, parent_score): '''Replace two species from the scores table.''' left, right = pair for other in species: l_score = tidy_up(scores, left, other) r_score = tidy_up(scores, right, other) new_pair = make_pair(parent, other) new_score = (l_score + r_score - parent_score)/2. scores[new_pair] = new_score species.append(parent) species.sort()

64 Sets and DictionariesPhylogenetic Trees def update(species, scores, pair, parent, parent_score): '''Replace two species from the scores table.''' left, right = pair for other in species: l_score = tidy_up(scores, left, other) r_score = tidy_up(scores, right, other) new_pair = make_pair(parent, other) new_score = (l_score + r_score - parent_score)/2. scores[new_pair] = new_score species.append(parent) species.sort()

65 Sets and DictionariesPhylogenetic Trees def update(species, scores, pair, parent, parent_score): '''Replace two species from the scores table.''' left, right = pair for other in species: l_score = tidy_up(scores, left, other) r_score = tidy_up(scores, right, other) new_pair = make_pair(parent, other) new_score = (l_score + r_score - parent_score)/2. scores[new_pair] = new_score species.append(parent) species.sort()

66 Sets and DictionariesPhylogenetic Trees def update(species, scores, pair, parent, parent_score): '''Replace two species from the scores table.''' left, right = pair for other in species: l_score = tidy_up(scores, left, other) r_score = tidy_up(scores, right, other) new_pair = make_pair(parent, other) new_score = (l_score + r_score - parent_score)/2. scores[new_pair] = new_score species.append(parent) species.sort()

67 Sets and DictionariesPhylogenetic Trees def update(species, scores, pair, parent, parent_score): '''Replace two species from the scores table.''' left, right = pair for other in species: l_score = tidy_up(scores, left, other) r_score = tidy_up(scores, right, other) new_pair = make_pair(parent, other) new_score = (l_score + r_score - parent_score)/2. scores[new_pair] = new_score species.append(parent) species.sort()

68 Sets and DictionariesPhylogenetic Trees def update(species, scores, pair, parent, parent_score): '''Replace two species from the scores table.''' left, right = pair for other in species: l_score = tidy_up(scores, left, other) r_score = tidy_up(scores, right, other) new_pair = make_pair(parent, other) new_score = (l_score + r_score - parent_score)/2. scores[new_pair] = new_score species.append(parent) species.sort()

69 Sets and DictionariesPhylogenetic Trees def update(species, scores, pair, parent, parent_score): '''Replace two species from the scores table.''' left, right = pair for other in species: l_score = tidy_up(scores, left, other) r_score = tidy_up(scores, right, other) new_pair = make_pair(parent, other) new_score = (l_score + r_score - parent_score)/2. scores[new_pair] = new_score species.append(parent) species.sort() def combos(): def find_min_pair(): def create_new_parent(): def remove_entries(): def update(): if __name__ == '__main__':...main program...

70 Sets and DictionariesPhylogenetic Trees def tidy_up(scores, old, other): '''Clean out references to old species.''' pair = make_pair(old, other) score = scores[pair] del scores[pair] return score

71 Sets and DictionariesPhylogenetic Trees def tidy_up(scores, old, other): '''Clean out references to old species.''' pair = make_pair(old, other) score = scores[pair] del scores[pair] return score

72 Sets and DictionariesPhylogenetic Trees def tidy_up(scores, old, other): '''Clean out references to old species.''' pair = make_pair(old, other) score = scores[pair] del scores[pair] return score def make_pair(left, right): '''Make an ordered pair of species.''' if left < right: return (left, right) else: return (right, left)

73 Sets and DictionariesPhylogenetic Trees def tidy_up(scores, old, other): '''Clean out references to old species.''' pair = make_pair(old, other) score = scores[pair] del scores[pair] return score def make_pair(left, right): '''Make an ordered pair of species.''' if left < right: return (left, right) else: return (right, left) def combos(): def find_min_pair(): def create_new_parent(): def remove_entries(): def make_pair(): def tidy_up(): def update(): if __name__ == '__main__':...main program...

74 Sets and DictionariesPhylogenetic Trees $ python phylogen.py [human werewolf] 2.5 [[human werewolf] vampire] 3.5 [[[human werewolf] vampire] mermaid] 6.5 Exercise 1: write unit tests Exercise 2: reconstruct entire tree Exercise 3: why does update sort?

75 November 2010 created by Elango Cheran Copyright © Software Carpentry 2010 This work is licensed under the Creative Commons Attribution License See http://software-carpentry.org/license.html for more information.


Download ppt "Phylogenetic Trees Copyright © Software Carpentry 2010 This work is licensed under the Creative Commons Attribution License See"

Similar presentations


Ads by Google