Probability estimation and weights: Weighting training sequences


1 Weighting training sequences
Why do we want to weight training sequences?
Many different proposals:
– Based on trees
– Based on the 3D position of the sequences
– Interested only in classifying family membership
– Maximizing entropy

2 Why do we want to weight training sequences?
Some of the sequences can be closely related to each other and should not have the same influence on the estimation process as a sequence that is highly diverged.
– Phylogenetic trees
– Sequences AGAA, CCTC, AGTC
[Figure: tree relating the sequences AGTC, AGAA and CCTC]

3 Weighting schemes based on trees
– Thompson, Higgins & Gibson (1994): weights are electric currents as calculated by Kirchhoff’s laws
– Gerstein, Sonnhammer & Chothia (1994)
– Root weights from Gaussian parameters: Altschul-Carroll-Lipman weights for a three-leaf tree (1989)

4 Thompson, Higgins & Gibson
[Figure: tree with leaves 1, 2, 3 drawn as an electric network of voltages, currents and resistances]

5 Thompson, Higgins & Gibson
[Figure: tree with leaves 1, 2, 3]
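The electric-network picture can be checked with a few lines of arithmetic. The sketch below assumes a three-leaf tree in which leaves 1 and 2 join at an internal node that connects to the root, and leaf 3 hangs directly off the root. The edge lengths t1 = t2 = 2, t3 = 4, t4 = 3 are hypothetical (the figure is not preserved in the transcript); they are chosen because they give the 1:1:2 ratio quoted on the summary slide. Edge lengths act as resistances, a voltage is applied at the root, the leaves are grounded, and the leaf currents are taken as the weights. The function name `thg_weights` is illustrative only.

```python
from fractions import Fraction

def thg_weights(t1, t2, t3, t4, voltage=1):
    """Leaf currents for the tree ((1,2),3) with edge lengths as resistances.

    Hypothetical helper: leaves 1 and 2 hang off an internal node via t1 and t2,
    that node connects to the root via t4, and leaf 3 connects to the root via t3.
    """
    t1, t2, t3, t4 = (Fraction(t) for t in (t1, t2, t3, t4))
    v = Fraction(voltage)
    # Leaves 1 and 2 are resistors in parallel, in series with t4.
    parallel = (t1 * t2) / (t1 + t2)
    branch_current = v / (t4 + parallel)      # current flowing down edge t4
    # The current splits between leaves 1 and 2 inversely to their resistance.
    i1 = branch_current * t2 / (t1 + t2)
    i2 = branch_current * t1 / (t1 + t2)
    i3 = v / t3                               # leaf 3 sits directly under the root
    total = i1 + i2 + i3
    return [i / total for i in (i1, i2, i3)]  # normalise so the weights sum to 1

# Assumed edge lengths, not taken from the slides.
weights = thg_weights(2, 2, 4, 3)
print(weights)  # normalised weights in the ratio 1:1:2
```

Exact fractions are used so the ratio can be read off without rounding noise.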

6 Gerstein, Sonnhammer & Chothia
Works up the tree, incrementing the weights
– Initially: weights are set to the edge lengths (resistances in the previous example)
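A minimal sketch of this bottom-up procedure, again for a three-leaf tree with the hypothetical edge lengths t1 = t2 = 2, t3 = 4, t4 = 3 (an assumption, chosen because it yields the 7:7:8 ratio on the summary slide). It assumes the increment rule in which the length of an edge is shared among the leaves below it in proportion to their current weights; the function name `gsc_weights` is illustrative.

```python
from fractions import Fraction

def gsc_weights(t1, t2, t3, t4):
    """Bottom-up weights for the tree ((1,2),3); a sketch, not a general tree walker."""
    # Step 1: each leaf weight starts as the length of the edge directly above it.
    w = [Fraction(t1), Fraction(t2), Fraction(t3)]
    # Step 2: at the internal node above leaves 1 and 2, the edge t4 is shared
    # among the leaves below it in proportion to their current weights.
    below = [0, 1]
    total_below = sum(w[i] for i in below)
    increments = [Fraction(t4) * w[i] / total_below for i in below]
    for i, inc in zip(below, increments):
        w[i] += inc
    return w

# Assumed edge lengths, not taken from the slides.
gsc = gsc_weights(2, 2, 4, 3)
print(gsc)  # unnormalised weights 7/2, 7/2, 4, i.e. the ratio 7:7:8
```

The weights are left unnormalised so that the initial values (the edge lengths) remain visible in the result.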

7 Gerstein, Sonnhammer & Chothia
[Figure: tree with leaves 1, 2, 3; labels 1, 2, 0]

8 Gerstein, Sonnhammer & Chothia
Small difference with Thompson, Higgins & Gibson?
[Figure: tree with leaves 1 and 2]

9 Root weights from Gaussian parameters
– Continuous values instead of discrete members of an alphabet
– Probability density instead of a substitution matrix
– Example: Gaussian

10 Root weights from Gaussian parameters

11 Root weights from Gaussian parameters
Altschul-Carroll-Lipman weights for a tree with three leaves
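One way to see where such root weights come from is sketched below. It assumes that each edge of length t contributes a Gaussian with variance t, so that the most likely root value is a weighted mean of the leaf values. Two Gaussian estimates combine with weights inversely proportional to their variances, and variances add along an edge. The tree shape ((1,2),3) and the edge lengths t1 = t2 = 2, t3 = 4, t4 = 3 are assumptions; they reproduce the 1:1:2 ratio on the summary slide. The function name `acl_weights` is illustrative.

```python
from fractions import Fraction

def acl_weights(t1, t2, t3, t4):
    """Weights of the three leaf values in the Gaussian estimate of the root value."""
    t1, t2, t3, t4 = map(Fraction, (t1, t2, t3, t4))
    # Combine leaves 1 and 2 at their parent: inverse-variance weighted mean,
    # with effective variance t12 = t1*t2 / (t1 + t2).
    a1 = t2 / (t1 + t2)
    a2 = t1 / (t1 + t2)
    t12 = t1 * t2 / (t1 + t2)
    # Carry that estimate up edge t4 (variances add), then combine with leaf 3.
    var_left = t12 + t4
    left_share = t3 / (t3 + var_left)
    right_share = var_left / (t3 + var_left)
    return [left_share * a1, left_share * a2, right_share]

# Assumed edge lengths, not taken from the slides.
acl = acl_weights(2, 2, 4, 3)
print(acl)  # weights in the ratio 1:1:2
```

Written out, this gives w1 = t2·t3/Ω, w2 = t1·t3/Ω and w3 = (t1·t2 + t4·(t1 + t2))/Ω with Ω = (t1 + t2)(t3 + t4) + t1·t2.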

12 Root weights from Gaussian parameters
[Figure: tree with leaves 1, 2, 3]

13 Weighting schemes based on trees
– Thompson, Higgins & Gibson (electric current): 1:1:2
– Gerstein, Sonnhammer & Chothia: 7:7:8
– Altschul-Carroll-Lipman weights for a tree with three leaves: 1:1:2

14 Weighting scheme using ‘sequence space’
Voronoi weights
[Formula for the Voronoi weights not preserved]
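The idea behind Voronoi weights is that each training sequence is weighted by the share of sequence space that lies closer to it than to any other training sequence. The sketch below makes several assumptions that are not in the slides: distance is Hamming distance, the space is every sequence of the same length over the alphabet ACGT, and ties are split equally. For longer sequences the space would normally be sampled at random rather than enumerated. The names `hamming` and `voronoi_weights` are illustrative.

```python
from fractions import Fraction
from itertools import product

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def voronoi_weights(seqs, alphabet="ACGT"):
    """Share of all sequences that are nearest to each training sequence."""
    counts = [Fraction(0)] * len(seqs)
    for candidate in product(alphabet, repeat=len(seqs[0])):
        dists = [hamming(candidate, s) for s in seqs]
        best = min(dists)
        nearest = [i for i, d in enumerate(dists) if d == best]
        # A candidate equidistant from several sequences is split among them.
        for i in nearest:
            counts[i] += Fraction(1, len(nearest))
    total = sum(counts)
    return [c / total for c in counts]

vw = voronoi_weights(["AGAA", "CCTC", "AGTC"])
print(vw)
```

Exhaustive enumeration is only feasible here because the example sequences have length 4 (256 candidates).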

15 More weighting schemes
Maximum discrimination weights
Maximum entropy weights
– Based on averaging
– Based on maximum ‘uniformity’ (entropy)

16 Maximum discrimination weights
– Does not try to maximize likelihood or posterior probability
– It decides whether a sequence is a member of a family

17 Maximum discrimination weights
Discrimination D
Maximize D; the emphasis is on distant or difficult members

18 Maximum discrimination weights
Differences from the previous schemes:
– Iterative method: initial weights give rise to a model; the newly calculated posterior probabilities P(M|x) give rise to new weights and hence a new model, until convergence is reached
– It optimizes performance for what the model is designed to do: classifying whether a sequence is a member of a family
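The iteration described above can be sketched as follows. Everything specific here is an assumption made for illustration. The model is a simple column-frequency profile with pseudocounts, the alternative is a uniform random model, and the prior P(M) is 0.5. New weights are taken proportional to 1 − P(M|x), so poorly recognised sequences gain weight. Updates are damped, and a fixed number of iterations stands in for a convergence test. The function name `max_discrimination_weights` is hypothetical.

```python
def max_discrimination_weights(seqs, alphabet="ACGT", prior=0.5,
                               pseudocount=1.0, iterations=100, damping=0.5):
    """Iteratively reweight sequences so that poorly recognised ones count more."""
    n = len(seqs)
    length = len(seqs[0])
    weights = [1.0 / n] * n
    background = (1.0 / len(alphabet)) ** length   # P(x | random model)
    posteriors = [0.0] * n
    for _ in range(iterations):
        # Build a column-frequency model from the weighted sequences.
        model = []
        for col in range(length):
            counts = {a: pseudocount / len(alphabet) for a in alphabet}
            for w, s in zip(weights, seqs):
                counts[s[col]] += w * n
            total = sum(counts.values())
            model.append({a: c / total for a, c in counts.items()})
        # Posterior P(M | x) of each training sequence under that model.
        posteriors = []
        for s in seqs:
            like = 1.0
            for col, ch in enumerate(s):
                like *= model[col][ch]
            joint = like * prior
            posteriors.append(joint / (joint + background * (1 - prior)))
        # New weights proportional to 1 - P(M | x), mixed with the old ones.
        raw = [1.0 - p for p in posteriors]
        total_raw = sum(raw)
        new = [r / total_raw for r in raw]
        weights = [damping * o + (1 - damping) * v for o, v in zip(weights, new)]
    return weights, posteriors

md_weights, md_posteriors = max_discrimination_weights(["AGAA", "CCTC", "AGTC"])
print(md_weights, md_posteriors)
```

Because both the old and the new weight vectors sum to 1, the damped mixture does too.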

19 More weighting schemes
Maximum discrimination weights
Maximum entropy weights
– Based on averaging
– Based on maximum ‘uniformity’ (entropy)

20 Maximum entropy weights
Entropy = a measure of the average uncertainty of an outcome (maximum when we are maximally uncertain about the outcome)
Averaging:

21 Maximum entropy weights
Sequences: AGAA, CCTC, AGTC
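The averaging formula itself is not preserved in the transcript. The sketch below assumes it is the position-based scheme in which, for each column, a sequence receives 1/(m·k), where m is the number of distinct residues in that column and k is the number of sequences sharing this sequence's residue there. The per-column contributions are summed and normalised. The function name `position_based_weights` is illustrative.

```python
from collections import Counter
from fractions import Fraction

def position_based_weights(seqs):
    """Sum per-column contributions 1 / (m * k) and normalise to 1."""
    weights = [Fraction(0)] * len(seqs)
    for column in zip(*seqs):
        counts = Counter(column)
        m = len(counts)                      # distinct residues in this column
        for k, residue in enumerate(column):
            weights[k] += Fraction(1, m * counts[residue])
    total = sum(weights)
    return [w / total for w in weights]

avg = position_based_weights(["AGAA", "CCTC", "AGTC"])
print(avg)  # 3/8, 3/8, 1/4
```

Under this assumption AGTC gets the smallest weight, because at every position it shares its residue with one of the other two sequences.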

22 Maximum entropy weights
‘Uniformity’:

23 Maximum entropy weights
Sequences: AGAA, CCTC, AGTC

24 Maximum entropy weights
Solving the equations leads to:
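The equations and their solution are not preserved in the transcript. As a numerical stand-in, the sketch below assumes the ‘uniformity’ criterion means choosing non-negative weights summing to 1 that maximise the total entropy of the weighted residue distributions over all columns. It then searches a coarse grid over the three weights. The function names and the grid resolution are illustrative choices.

```python
from math import log2

def total_column_entropy(seqs, weights):
    """Sum over columns of the entropy (in bits) of the weighted residue distribution."""
    total = 0.0
    for column in zip(*seqs):
        probs = {}
        for residue, w in zip(column, weights):
            probs[residue] = probs.get(residue, 0.0) + w
        total -= sum(p * log2(p) for p in probs.values() if p > 0)
    return total

def grid_search(seqs, steps=20):
    """Try every weight triple on a grid with spacing 1/steps (three sequences only)."""
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            w = (i / steps, j / steps, (steps - i - j) / steps)
            h = total_column_entropy(seqs, w)
            if best is None or h > best[0]:
                best = (h, w)
    return best

best_entropy, best_weights = grid_search(["AGAA", "CCTC", "AGTC"])
print(best_entropy, best_weights)  # 4.0 (0.5, 0.5, 0.0)
```

For these three sequences the total entropy reduces to 2·H(w1) + 2·H(w2), with H the binary entropy, so the maximum is at w1 = w2 = 1/2 and w3 = 0. AGTC receives no weight under this criterion.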

25 Summary of the entropy methods
– Maximum entropy weights (averaging)
– Maximum entropy weights (‘uniformity’)

26 Conclusion
– Many different methods
– Which one to use depends on the problem
Questions?

