Presentation is loading. Please wait.

Presentation is loading. Please wait.

ETHEM ALPAYDIN © The MIT Press, 2010 Lecture Slides for.

Similar presentations


Presentation on theme: "ETHEM ALPAYDIN © The MIT Press, 2010 Lecture Slides for."— Presentation transcript:

1 ETHEM ALPAYDIN © The MIT Press, 2010 alpaydin@boun.edu.tr http://www.cmpe.boun.edu.tr/~ethem/i2ml2e Lecture Slides for

2

3 Neural Networks Networks of processing units (neurons) with connections (synapses) between them Large number of neurons: 10 10 Large connectitivity: 10 5 Parallel processing Distributed computation/memory Robust to noise, failures 3Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

4 Understanding the Brain Levels of analysis (Marr, 1982) 1. Computational theory 2. Representation and algorithm 3. Hardware implementation Reverse engineering: From hardware to theory Parallel processing: SIMD vs MIMD Neural net: SIMD with modifiable local memory Learning: Update by training/experience 4Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

5 Perceptron 5 (Rosenblatt, 1962) Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

6 What a Perceptron Does Regression: y=wx+w 0 Classification: y=1(wx+w 0 >0) 6 w w0w0 y x x 0 =+1 w w0w0 y x s w0w0 y x Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

7 K Outputs 7 Classification : Regression : Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

8 Training 8 Online (instances seen one by one) vs batch (whole sample) learning: No need to store the whole sample Problem may change in time Wear and degradation in system components Stochastic gradient-descent: Update after a single pattern Generic update rule (LMS rule): Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

9 Training a Perceptron: Regression Regression (Linear output): 9 Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

10 Single sigmoid output K>2 softmax outputs Classification 10Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

11 Learning Boolean AND 11Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

12 XOR No w 0, w 1, w 2 satisfy: 12 (Minsky and Papert, 1969) Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

13 Multilayer Perceptrons 13 (Rumelhart et al., 1986) Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

14 14 x 1 XOR x 2 = (x 1 AND ~x 2 ) OR (~x 1 AND x 2 ) Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

15 Backpropagation 15Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

16 16 Regression Forward Backward x Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

17 Regression with Multiple Outputs 17 zhzh v ih yiyi xjxj w hj Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

18 18Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

19 19Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

20 20 whx+w0whx+w0 zhzh vhzhvhzh Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

21 One sigmoid output y t for P(C 1 |x t ) and P(C 2 |x t ) ≡ 1-y t Two-Class Discrimination 21Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

22 K>2 Classes 22Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

23 MLP with one hidden layer is a universal approximator (Hornik et al., 1989), but using multiple layers may lead to simpler networks Multiple Hidden Layers 23Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

24 Momentum Adaptive learning rate Improving Convergence 24Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

25 Overfitting/Overtraining 25 Number of weights: H (d+1)+(H+1)K Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

26 26Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

27 Structured MLP 27 (Le Cun et al, 1989) Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

28 Weight Sharing 28Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

29 Hints 29 Invariance to translation, rotation, size Virtual examples Augmented error: E’=E+λ h E h If x’ and x are the “same”: E h =[g(x|θ)- g(x’|θ)] 2 Approximation hint: (Abu-Mostafa, 1995) Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

30 Tuning the Network Size Destructive Weight decay: Constructive Growing networks 30 (Ash, 1989) (Fahlman and Lebiere, 1989) Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

31 Consider weights w i as random vars, prior p(w i ) Weight decay, ridge regression, regularization cost=data-misfit + λ complexity More about Bayesian methods in chapter 14 Bayesian Learning 31Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

32 Dimensionality Reduction 32Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

33 33Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

34 Learning Time Applications: Sequence recognition: Speech recognition Sequence reproduction: Time-series prediction Sequence association Network architectures Time-delay networks (Waibel et al., 1989) Recurrent networks (Rumelhart et al., 1986) 34Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

35 Time-Delay Neural Networks 35Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

36 Recurrent Networks 36Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

37 Unfolding in Time 37Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)


Download ppt "ETHEM ALPAYDIN © The MIT Press, 2010 Lecture Slides for."

Similar presentations


Ads by Google