Presentation on theme: "Optimization methods. Morten Nielsen, Department of Systems Biology, DTU; IIB-INTECH, UNSAM, Argentina."— Presentation transcript:

1 Optimization methods. Morten Nielsen, Department of Systems Biology, DTU; IIB-INTECH, UNSAM, Argentina

2 Minimization: the path to the closest local minimum = local minimization. *Adapted from slides by Chen Keasar, Ben-Gurion University

3 Minimization: the path to the closest local minimum = local minimization. *Adapted from slides by Chen Keasar, Ben-Gurion University

4 Minimization: the path to the global minimum. *Adapted from slides by Chen Keasar, Ben-Gurion University

5 Outline. Optimization procedures: gradient descent, Monte Carlo. Overfitting: cross-validation. Method evaluation.

6 Linear methods. Error estimate. [Figure: inputs I1 and I2, weights w1 and w2, a linear function, and output o.]

7 Gradient descent (from Wikipedia). Gradient descent is based on the observation that if the real-valued function F(x) is defined and differentiable in a neighborhood of a point a, then F(x) decreases fastest if one goes from a in the direction of the negative gradient of F at a. It follows that, if b = a − γ∇F(a) for γ > 0 a small enough number, then F(b) < F(a).
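As an illustration (my sketch, not from the slides), a minimal Python version of this iteration, assuming F(x) = x² and a fixed step size γ:

# A minimal sketch of gradient descent, assuming F(x) = x^2 and a
# fixed step size gamma; illustrative, not from the slides.
def gradient_descent(grad, a, gamma=0.1, steps=50):
    """Repeatedly move against the gradient: a <- a - gamma * grad(a)."""
    for _ in range(steps):
        a = a - gamma * grad(a)
    return a

# F(x) = x^2 has gradient 2x and its minimum at x = 0.
print(gradient_descent(lambda x: 2 * x, a=5.0))  # approaches 0.0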

8 Gradient descent (example)

9 Gradient descent

10 Weights are changed in the opposite direction of the gradient of the error: Δwᵢ = −η · ∂E/∂wᵢ.

11 Gradient descent (linear function). Weights are changed in the opposite direction of the gradient of the error. [Figure: inputs I1, I2; weights w1, w2; linear function with output o = w1·I1 + w2·I2.]

12 Gradient descent. Weights are changed in the opposite direction of the gradient of the error: Δwᵢ = −η · ∂E/∂wᵢ.

13 Gradient descent. Example. Weights are changed in the opposite direction of the gradient of the error. With the per-example error E = ½(o − t)², the gradient is ∂E/∂wᵢ = (o − t) · Iᵢ.

14 Gradient descent. Example. Weights are changed in the opposite direction of the gradient of the error, so the update is Δwᵢ = −η · (o − t) · Iᵢ.

15 Gradient descent. Doing it yourself. Weights are changed in the opposite direction of the gradient of the error. [Setup: inputs I1 = 1, I2 = 0; weights W1 = 0.1, W2 = 0.1; linear function with output o.] What are the weights after 2 forward (calculate predictions) and backward (update weights) iterations with the given input, and has the error decreased? (Use η = 0.1 and t = 1.)

16 Fill out the table:

itr  W1    W2    O
0    0.1   0.1
1
2

What are the weights after 2 forward/backward iterations with the given input, and has the error decreased? (Use η = 0.1, t = 1.) Inputs I1 = 1, I2 = 0; initial weights W1 = 0.1, W2 = 0.1; linear function with output o.

17 Fill out the table:

itr  W1    W2    O
0    0.1   0.1   0.1
1    0.19  0.1   0.19
2    0.27  0.1   0.27

What are the weights after 2 forward/backward iterations with the given input, and has the error decreased? (Use η = 0.1, t = 1.) Inputs I1 = 1, I2 = 0; initial weights W1 = 0.1, W2 = 0.1; linear function with output o.
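To check the table, here is a short sketch of the two forward/backward passes (my code, not from the slides), using the slide's values η = 0.1, t = 1, I1 = 1, I2 = 0:

# Verifying the worked example: eta = 0.1, target t = 1,
# inputs I = (1, 0), initial weights w = (0.1, 0.1).
eta, t = 0.1, 1.0
I = (1.0, 0.0)
w = [0.1, 0.1]
for itr in (1, 2):
    o = w[0] * I[0] + w[1] * I[1]        # forward: prediction
    for i in range(2):                   # backward: w_i -= eta * (o - t) * I_i
        w[i] -= eta * (o - t) * I[i]
    o = w[0] * I[0] + w[1] * I[1]
    print(itr, round(w[0], 2), round(w[1], 2), round(o, 2))
# prints: 1 0.19 0.1 0.19 and 2 0.27 0.1 0.27; the error (t - o)^2 decreases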

18 Monte Carlo. Because of their reliance on repeated computation of random or pseudo-random numbers, Monte Carlo methods are most suited to calculation by a computer. Monte Carlo methods tend to be used when it is infeasible or impossible to compute an exact result with a deterministic algorithm. Or when you are too stupid to do the math yourself?

19 Example: estimating π by independent Monte Carlo samples. Suppose we throw darts randomly (and uniformly) at the square:

Algorithm:
  for i = 1 .. ntrials
    x = random number in [0, r]
    y = random number in [0, r]
    distance = sqrt(x^2 + y^2)
    if distance <= r: hits++
  Output: π ≈ 4 · hits / ntrials

Adapted from course slides by Craig Douglas, http://www.chem.unl.edu/zeng/joy/mclab/mcintro.html
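A runnable version of this pseudocode (a sketch assuming r = 1; the function name is my own):

import random

# A minimal sketch of the dart-throwing estimator above, with r = 1.
def estimate_pi(ntrials=1_000_000, r=1.0):
    hits = 0
    for _ in range(ntrials):
        x, y = random.uniform(0, r), random.uniform(0, r)
        if x * x + y * y <= r * r:   # dart landed inside the quarter circle
            hits += 1
    return 4 * hits / ntrials        # quarter-circle / square area = pi / 4

print(estimate_pi())  # e.g. 3.1418 (varies from run to run)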

20 Estimating π

21 Monte Carlo (minimization). [Figure: the two cases of a proposed move. Moves with dE < 0 are always accepted; moves with dE > 0 are accepted only with a temperature-dependent probability.]

22 The Traveling Salesman Adapted from www.mpp.mpg.de/~caldwell/ss11/ExtraTS.pdf

23-28 [Image-only slides: successive iterations of the Monte Carlo search for the Traveling Salesman example above; no text content survives.]
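Since the slides above are image-only, here is a compact simulated-annealing sketch of the Traveling Salesman search; the random cities, temperature schedule, and segment-reversal move are illustrative assumptions, not taken from the referenced slides.

import math, random

# Tour length for a closed tour over 2D city coordinates.
def tour_length(tour, cities):
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def anneal(cities, T=10.0, cooling=0.999, steps=20000):
    tour = list(range(len(cities)))
    E = tour_length(tour, cities)
    for _ in range(steps):
        i, j = sorted(random.sample(range(len(tour)), 2))
        new = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]   # reverse a segment
        dE = tour_length(new, cities) - E
        # Metropolis: always accept improvements, sometimes accept worse tours
        if dE < 0 or random.random() < math.exp(-dE / T):
            tour, E = new, E + dE
        T *= cooling                                          # cool down slowly
    return tour, E

cities = [(random.random(), random.random()) for _ in range(30)]
print(anneal(cities)[1])  # tour length after annealing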

29 Gibbs sampler. Monte Carlo simulations. RFFGGDRGAPKRG YLDPLIRGLLARPAKLQV KPGQPPRLLIYDASNRATGIPA GSLFVYNITTNKYKAFLDKQ SALLSSDITASVNCAK GFKGEQGPKGEP DVFKELKVHHANENI SRYWAIRTRSGGI TYSTNEIDLQLSQEDGQTIE. E1 = 5.4; E2 = 5.7; E2 = 5.2. dE > 0: P_accept = 1; dE < 0: 0 < P_accept < 1. Note the sign: this is a maximization.

30 Monte Carlo temperature. What is the Monte Carlo temperature? Say dE = −0.2, and compare T = 1 with T = 0.001.
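A worked example (assuming the maximization convention of slide 29, so an unfavorable move with dE < 0 is accepted with P_accept = exp(dE/T)): for dE = −0.2, T = 1 gives P_accept = e^(−0.2) ≈ 0.82, while T = 0.001 gives P_accept = e^(−200) ≈ 0. A high temperature accepts almost any move; a low temperature accepts essentially only improvements.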

31 MC minimization

32 Monte Carlo - Examples Why a temperature?

33 Local minima

34 Stabilization matrix method

35 A prediction method contains a very large set of parameters. A matrix for predicting binding of 9-meric peptides has 9 × 20 = 180 weights. Overfitting is a problem. [Figure: data-driven method training over the years.]

36 Regression methods. The mathematics. y = ax + b: a 2-parameter model; good description, poor fit. y = ax^6 + bx^5 + cx^4 + dx^3 + ex^2 + fx + g: a 7-parameter model; poor description, good fit.

37 Model over-fitting

38 Stabilization matrix method (ridge regression). The mathematics. y = ax + b: a 2-parameter model; good description, poor fit. y = ax^6 + bx^5 + cx^4 + dx^3 + ex^2 + fx + g: a 7-parameter model; poor description, good fit.
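A small demo of this 2- versus 7-parameter contrast (my example, not from the slides): fit noisy samples of a line with polynomials of degree 1 and degree 6.

import numpy as np

# Fit 7 noisy samples of y = 2x + 1 with degree-1 and degree-6 polynomials.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 7)
y = 2 * x + 1 + rng.normal(0, 0.2, x.size)       # underlying truth: y = 2x + 1

for deg in (1, 6):
    coeffs = np.polyfit(x, y, deg)               # least-squares polynomial fit
    resid = y - np.polyval(coeffs, x)
    print(deg, round(float(np.sum(resid ** 2)), 4))
# degree 6 drives the training residual to ~0 (good fit) while wildly
# mis-predicting between the points (poor description)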

39 SMM training. Evaluated on 600 MHC:peptide binding data points. λ = 0 gives PCC = 0.70; λ = 0.1 gives PCC = 0.78 (PCC: Pearson correlation coefficient).

40 Stabilization matrix method. The analytic solution. Each peptide is represented as 9 × 20 = 180 numbers. H is a stack of such vectors of 180 values, and t is the target value (the measured binding). λ is a parameter introduced to suppress the effect of noise in the experimental data and lower the effect of overfitting. The solution is the standard ridge-regression closed form, w = (HᵀH + λI)⁻¹Hᵀt.
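A minimal sketch of this closed form in Python; the random H and t are stand-ins for the encoded peptides and measured binding values.

import numpy as np

# Analytic SMM/ridge solution: w = (H^T H + lam * I)^-1 H^T t
def smm_solve(H, t, lam=0.1):
    n = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(n), H.T @ t)

H = np.random.rand(600, 180)     # stand-in for 600 encoded 9-mers
t = np.random.rand(600)          # stand-in for measured binding values
w = smm_solve(H, t, lam=0.1)
print(w.shape)                   # (180,): one weight per position/amino acid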

41 SMM - Stabilization matrix method. [Figure: inputs I1, I2; weights w1, w2; linear function; output o.] The error has two parts: a sum over data points (the squared prediction error) and a sum over weights (the λ-weighted penalty): E = Σ_data ½(o − t)² + λ Σᵢ wᵢ².

42 SMM - Stabilization matrix method. [Figure as above.] Per target error: E = ½(o − t)² + λ Σᵢ wᵢ². Global error: the sum of the per target errors over all data points.

43 SMM - Stabilization matrix method. Do it yourself. [Figure as above.] Derive the per target gradient ∂E/∂wᵢ for E = ½(o − t)² + λ Σᵢ wᵢ².

44 SMM - Stabilization matrix method. [Figure as above.] Per target: ∂E/∂wᵢ = (o − t) · Iᵢ + 2λwᵢ, so the update is Δwᵢ = −η[(o − t) · Iᵢ + 2λwᵢ].

45 SMM - Stabilization matrix method. [Figure: inputs I1, I2; weights w1, w2; linear function; output o.]
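A sketch of per-target SMM gradient descent for a linear unit like the one above; η, λ, and the toy data are illustrative assumptions, not values from the slides.

import numpy as np

# Per-target SMM gradient descent: for each example, step against
# the gradient (o - t) * I_i + 2 * lam * w_i.
def smm_gd(X, t, eta=0.05, lam=0.01, epochs=200):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_j, t_j in zip(X, t):              # one target at a time
            o = w @ x_j                         # forward: linear prediction
            w -= eta * ((o - t_j) * x_j + 2 * lam * w)
    return w

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
t = np.array([1.0, 0.5, 1.5])
print(smm_gd(X, t))   # a small lam keeps weights near the least-squares fit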

46 SMM - Stabilization matrix method. Monte Carlo. [Figure as above.] Use the global error: make a random change to the weights, calculate the change in the "global" error, and update the weights if the MC move is accepted. Note the difference between MC and GD in the use of the "global" versus the "per target" error.
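For contrast, a sketch of the Monte Carlo variant driven by the global error; the fixed temperature and Gaussian weight perturbations are my assumptions.

import math, random
import numpy as np

# Global SMM error: data term plus lam-weighted penalty on the weights.
def global_error(w, X, t, lam=0.01):
    return 0.5 * np.sum((X @ w - t) ** 2) + lam * np.sum(w ** 2)

def smm_mc(X, t, T=0.01, steps=5000, lam=0.01):
    w = np.zeros(X.shape[1])
    E = global_error(w, X, t, lam)
    for _ in range(steps):
        trial = w + np.random.normal(0, 0.05, w.shape)     # random change
        dE = global_error(trial, X, t, lam) - E
        if dE < 0 or random.random() < math.exp(-dE / T):  # Metropolis
            w, E = trial, E + dE
    return w

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
t = np.array([1.0, 0.5, 1.5])
print(smm_mc(X, t))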

47 Training/evaluation procedure. Define the method. Select data. Deal with data redundancy: in the method (sequence weighting) or in the data (Hobohm). Deal with over-fitting: either in the method (the SMM regularization term) or in training (stop fitting on test set performance). Evaluate the method using cross-validation (a sketch follows below).
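A minimal sketch of the cross-validation step; train_fn and eval_fn are hypothetical stand-ins for programs like smm and pep2score in the script below.

# n-fold cross-validation: hold out each fold in turn, train on the rest.
def cross_validate(data, train_fn, eval_fn, n_folds=5):
    scores = []
    for k in range(n_folds):
        test = data[k::n_folds]                              # fold k held out
        train = [d for i, d in enumerate(data) if i % n_folds != k]
        scores.append(eval_fn(train_fn(train), test))
    return sum(scores) / n_folds                             # mean performance

# Toy usage: the "model" is the mean target, evaluated by mean abs error.
data = [(i, 2 * i) for i in range(20)]
mean_t = lambda d: sum(t for _, t in d) / len(d)
mae = lambda m, d: sum(abs(m - t) for _, t in d) / len(d)
print(cross_validate(data, mean_t, mae))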

48 A small doit script (/home/user1/bin/doit_ex):

#! /bin/tcsh
foreach a ( `cat allelefile` )
  mkdir -p $a
  cd $a
  foreach l ( 0 1 2.5 5 10 20 30 )
    mkdir -p l.$l
    cd l.$l
    foreach n ( 0 1 2 3 4 )
      smm -nc 500 -l $l train.$n > mat.$n
      pep2score -mat mat.$n eval.$n > eval.$n.pred
    end
    echo $a $l `cat eval.?.pred | grep -v "#" | gawk '{print $2,$3}' | xycorr`
    cd ..
  end
  cd ..
end

