Wed June 12 Goals of today’s lecture. Learning Mechanisms


Wed June 12 Goals of today’s lecture. Learning Mechanisms Where is AI and where is it going? What to look for in the future? Status of Turing test? Material and guidance for exam. Discuss any outstanding problems on last assignment.

Automated Learning Techniques ID3: A technique for automatically developing a good decision tree based on a given classification of examples and counter-examples.

Automated Learning Techniques Algorithm W (Winston): an algorithm that develops a “concept” based on examples and counter-examples.

Automated Learning Techniques Perceptron: an algorithm that develops a classification based on examples and counter-examples. Techniques for non-linearly separable data (neural networks, support vector machines).

Learning in Neural Networks Perceptrons

Natural versus Artificial Neuron: the natural (biological) neuron compared with the McCulloch-Pitts neuron.

One Neuron (McCulloch-Pitts) The biological neuron is very complicated, but abstracting away the details we have an integrate-and-fire unit: inputs x1, ..., xn with weights w1, ..., wn are summed (S = w1·x1 + ... + wn·xn), and the neuron fires if S exceeds the threshold.
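
A minimal sketch in Python (my own illustration; the function name and the example weights/threshold are invented for this note) of the abstracted unit: multiply inputs by weights, sum, and fire only if the sum exceeds the threshold.

```python
def mcculloch_pitts_unit(x, w, threshold):
    """Abstract integrate-and-fire neuron: weighted sum of inputs vs. a threshold."""
    s = sum(wi * xi for wi, xi in zip(w, x))   # integrate: S = w1*x1 + ... + wn*xn
    return 1 if s > threshold else 0           # fire only if the sum exceeds the threshold

# Example: weights (1, 1) and threshold 1.5 fire only when both inputs are 1.
print(mcculloch_pitts_unit([1, 1], [1, 1], 1.5))  # 1
print(mcculloch_pitts_unit([1, 0], [1, 1], 1.5))  # 0
```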

Perceptron [figure: an input pattern fed through weights into a single unit, used for pattern identification]. (Note: the neuron is trained.)

Three Main Issues Representability Learnability Generalizability

One Neuron (Perceptron) What can be represented by one neuron? Is there an automatic way to learn a function from examples?

Feed-Forward Network [figure: two layers of weights feeding forward to an output unit].

Representability What functions can be represented by a network of McCulloch-Pitts neurons? Theorem: Every logic function of an arbitrary number of variables can be represented by a three-level network of such neurons.

Proof Show the simple functions: AND, OR, NOT, IMPLIES. Recall the representability of logic functions in DNF (disjunctive normal form).
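
To make the DNF argument concrete, here is a sketch of my own (not from the slides) of the three-level construction behind the theorem: level 1 produces the literals (the inputs and their negations), level 2 has one AND unit per satisfying row of the truth table, and level 3 ORs those units together.

```python
from itertools import product

def threshold_unit(x, w, theta):
    # single McCulloch-Pitts unit: fires iff the weighted sum exceeds theta
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > theta else 0

def three_level_network(truth_table, n):
    """truth_table: dict mapping n-bit input tuples to 0/1.
    Returns a function computing it with a 3-level network (literals -> ANDs -> OR)."""
    true_rows = [row for row, out in truth_table.items() if out == 1]

    def network(x):
        # Level 1: literals x_i and their negations (NOT via weight -1, threshold -0.5).
        literals = list(x) + [threshold_unit([xi], [-1], -0.5) for xi in x]
        # Level 2: one AND unit per true row, matching that row exactly.
        ands = []
        for row in true_rows:
            chosen = [literals[i] if row[i] == 1 else literals[n + i] for i in range(n)]
            ands.append(threshold_unit(chosen, [1] * n, n - 0.5))
        # Level 3: OR of all the AND units.
        return threshold_unit(ands, [1] * len(ands), 0.5) if ands else 0

    return network

# Example: XOR on two inputs, not computable by one unit but fine with three levels.
xor_table = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
net = three_level_network(xor_table, 2)
print([net(x) for x in product([0, 1], repeat=2)])  # [0, 1, 1, 0]
```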

Perceptron What is representable? Linearly separable sets. Examples: the AND and OR functions. Not representable: XOR. In high dimensions, how can we tell? Questions: Is the set convex? Connected?

AND

OR

XOR
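
A quick check of the claims above (my own sketch; the weight grid is just an illustrative search range, not a proof): hand-picked weights give AND and OR with a single unit, while a brute-force search over a grid of weights and thresholds finds no single unit that reproduces XOR.

```python
from itertools import product

def unit(x, w, theta):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > theta else 0

inputs = list(product([0, 1], repeat=2))

# AND: weights (1, 1), threshold 1.5;  OR: weights (1, 1), threshold 0.5
print([unit(x, (1, 1), 1.5) for x in inputs])  # [0, 0, 0, 1]  -> AND
print([unit(x, (1, 1), 0.5) for x in inputs])  # [0, 1, 1, 1]  -> OR

# XOR: search a grid of weights/thresholds; no single unit reproduces [0, 1, 1, 0]
xor = [0, 1, 1, 0]
grid = [v / 2 for v in range(-8, 9)]           # -4.0, -3.5, ..., 4.0
found = any([unit(x, (w1, w2), t) for x in inputs] == xor
            for w1 in grid for w2 in grid for t in grid)
print(found)                                   # False: XOR is not linearly separable
```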

Convexity: Representable by a simple extension of the perceptron. Clue: A body is convex if, whenever two points are inside it, every point between them is also inside. So just take a perceptron with an input for each triple of points.

Connectedness: Not Representable

Representability Perceptron: only linearly separable sets (AND versus XOR; Convex versus Connected). Many linked neurons: universal. Proof: show AND, OR, NOT representable, then apply the DNF representation theorem.

Learnability Perceptron Convergence Theorem: if the classification is representable, then the perceptron algorithm converges. Proof (from slides). Multi-neuron networks: good heuristic learning techniques.

Generalizability Typically we train a perceptron on a sample set of examples and counter-examples and then use it on the general class. Training can be slow, but execution is fast. Main question: How does performance on the training set carry over to the general class? (Not simple.)

Programming: Just find the weights! AUTOMATIC PROGRAMMING (or learning). One neuron: Perceptron or Adaline. Multi-level: gradient descent on continuous neurons (sigmoid instead of step function).

Perceptron Convergence Theorem If there exists a perceptron solution, then the perceptron learning algorithm will find one in finite time. That is, IF there is a set of weights and a threshold which correctly classifies a class of examples and counter-examples, THEN one such set of weights can be found by the algorithm.

Perceptron Training Rule Loop: Take a positive or negative example. Apply it to the network. If the answer is correct, go to Loop. If incorrect, go to FIX. FIX: Adjust the network weights by the input example. If a positive example: Wnew = Wold + X; decrease the threshold. If a negative example: Wnew = Wold - X; increase the threshold. Go to Loop.
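
A sketch of this rule in Python (my own rendering of the slide's pseudocode; the dataset, pass limit, and step size of 1 for the threshold are illustrative assumptions):

```python
def train_perceptron(examples, n, max_passes=100):
    """examples: list of (x, label) with label +1 (positive) or -1 (negative).
    Implements the rule above: on a mistake, add/subtract the example and
    lower/raise the threshold."""
    w = [0.0] * n
    theta = 0.0
    for _ in range(max_passes):
        mistakes = 0
        for x, label in examples:
            fired = sum(wi * xi for wi, xi in zip(w, x)) > theta
            if fired == (label == 1):
                continue                      # correct answer: back to Loop
            mistakes += 1                     # incorrect: FIX
            if label == 1:                    # positive example
                w = [wi + xi for wi, xi in zip(w, x)]
                theta -= 1.0                  # make it easier to fire
            else:                             # negative example
                w = [wi - xi for wi, xi in zip(w, x)]
                theta += 1.0                  # make it harder to fire
        if mistakes == 0:                     # every example classified correctly
            return w, theta
    return w, theta

# Example: learn AND (positive only on (1, 1)).
data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), +1)]
print(train_perceptron(data, 2))
```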

Perceptron Conv Theorem (again) Preliminary: Note we can simplify the proof without loss of generality: use only positive examples (replace each negative example X by -X); assume the threshold is 0 (go up one dimension by encoding X as (X, 1)).

Perceptron Training Rule (simplified) Loop: Take a (positive) example. Apply it to the network. If the answer is correct, go to Loop. If incorrect, go to FIX. FIX: Adjust the network weights by the input example: Wnew = Wold + X. Go to Loop.

Proof of Conv Theorem Note: By hypothesis, there is an ε > 0 such that V*·X > ε for all X in F. 1. We can eliminate the threshold (add an extra dimension to the input): W·(x,y,z) > threshold if and only if W'·(x,y,z,1) > 0, where W' = (W, -threshold). 2. We can assume all examples are positive ones (replace each negative example by its negated vector): W·(x,y,z) < 0 if and only if W·(-x,-y,-z) > 0.
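
These two reductions amount to a one-line preprocessing step, sketched here (my own illustration, not from the slides): append a constant 1 coordinate so the threshold becomes an ordinary weight, and negate the negative examples, so the algorithm only ever needs W·X' > 0.

```python
def to_positive_no_threshold(examples):
    """Map (x, label) pairs to vectors X' such that a separating (W, -theta)
    exists iff some W' satisfies W'.X' > 0 for every transformed X'."""
    transformed = []
    for x, label in examples:
        xprime = list(x) + [1.0]              # extra coordinate absorbs the threshold
        if label != 1:
            xprime = [-v for v in xprime]     # negate negative examples
        transformed.append(xprime)
    return transformed
```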

Perceptron Conv. Thm. (ready for proof) Let F be a set of unit-length vectors. If there is a (unit) vector V* and a value ε > 0 such that V*·X > ε for all X in F, then the perceptron program goes to FIX only a finite number of times (regardless of the order of choice of the vectors X). Note: If F is a finite set, then such an ε automatically exists.

Proof (cont.) Consider the quotient V*·W / (|V*||W|). (Note: this is the cosine of the angle between V* and W.) Recall V* is a unit vector, so the quotient equals V*·W / |W|. The quotient is ≤ 1.

Proof (cont.) Consider the numerator. Each time FIX is visited, W changes via ADD: V*·W(n+1) = V*·(W(n) + X) = V*·W(n) + V*·X > V*·W(n) + ε. Hence after n iterations: V*·W(n) > nε. (*)

Proof (cont.) Now consider the denominator: |W(n+1)|² = W(n+1)·W(n+1) = (W(n) + X)·(W(n) + X) = |W(n)|² + 2 W(n)·X + 1 (recall |X| = 1) ≤ |W(n)|² + 1 (we are in FIX, so W(n)·X ≤ 0). So after n iterations: |W(n)|² ≤ n. (**)

Proof (cont.) Putting (*) and (**) together: Quotient = V*·W(n) / |W(n)| > nε / √n = √n · ε. Since the quotient is ≤ 1, this means n < 1/ε². So we enter FIX only a bounded number of times. Q.E.D.
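
A small numerical sanity check of the bound just proved (a sketch under the theorem's assumptions: unit-length positive examples with margin ε against a known unit vector V*; the data, random seed, and ε = 0.1 are made up for illustration): the number of visits to FIX stays below 1/ε².

```python
import math, random

random.seed(0)

# Build unit-length examples separated from the origin by v_star with margin eps = 0.1.
v_star = [1 / math.sqrt(2), 1 / math.sqrt(2)]
examples = []
while len(examples) < 200:
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    norm = math.hypot(x[0], x[1])
    if norm < 1e-12:
        continue
    x = [xi / norm for xi in x]                        # unit length
    if sum(v * xi for v, xi in zip(v_star, x)) > 0.1:  # keep only margin > eps
        examples.append(x)

w = [0.0, 0.0]
fixes = 0
changed = True
while changed:                                         # cycle until no FIX is needed
    changed = False
    for x in examples:
        if sum(wi * xi for wi, xi in zip(w, x)) <= 0:  # misclassified -> FIX
            w = [wi + xi for wi, xi in zip(w, x)]
            fixes += 1
            changed = True

print(fixes, "<=", 1 / 0.1 ** 2)                       # bound from the proof: n <= 1/eps^2
```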

Geometric Proof See hand slides.

Additional Facts Note: If the X's are presented in a systematic way, then a solution W is always found. Note: It is not necessarily the same as V*. Note: If F is not finite, a solution may not be obtained in finite time. The algorithm can be modified in minor ways and remains valid (e.g. bounded rather than unit-length examples; corresponding changes in the bound on W(n)).

Percentage of Boolean Functions Representable by a Perceptron
Inputs | Perceptrons (representable functions) | Functions (all Boolean functions)
1 | 4 | 4
2 | 14 | 16
3 | 104 | 256
4 | 1,882 | 65,536
5 | 94,572 | ~10^9
6 | 15,028,134 | ~10^19
7 | 8,378,070,864 | ~10^38
8 | 17,561,539,552,946 | ~10^77

What won't work? Example: Connectedness, even with a bounded-diameter perceptron. Compare with convexity (which works using sensors of order three).

What won't work? Try XOR.

What about non-linearly separable problems? Find "near-separable" solutions. Use a transformation of the data to a space where they are separable (the SVM approach). Use multi-level neurons.
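
One concrete instance of the transformation idea (a toy sketch of my own, not an SVM kernel method per se): adding the product feature x1·x2 lifts XOR into a space where it is linearly separable, so a single threshold unit suffices there.

```python
def unit(x, w, theta):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > theta else 0

def lift(x):
    # map (x1, x2) -> (x1, x2, x1*x2): XOR becomes linearly separable in this space
    return (x[0], x[1], x[0] * x[1])

# weights (1, 1, -2) and threshold 0.5 compute XOR on the lifted inputs
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, unit(lift(x), (1, 1, -2), 0.5))
# (0,0)->0, (0,1)->1, (1,0)->1, (1,1)->0
```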

Multi-Level Neurons It is difficult to find a global learning algorithm like the perceptron's. But... it turns out that methods related to gradient descent on many-parameter weights often give good results. This is what you see commercially now.
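
A minimal sketch of gradient descent on a multi-level network (my own toy code in plain Python; the architecture, learning rate, and iteration count are illustrative choices): a small two-layer sigmoid network learns XOR, which no single perceptron can represent.

```python
import math, random

random.seed(1)
sig = lambda z: 1.0 / (1.0 + math.exp(-z))

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]   # XOR
H = 4                                                         # hidden sigmoid units

W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
W2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0
lr = 1.0

for _ in range(10000):                                        # gradient descent on squared error
    for x, y in data:
        h = [sig(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j]) for j in range(H)]
        o = sig(sum(W2[j] * h[j] for j in range(H)) + b2)
        d_o = (o - y) * o * (1 - o)                           # error signal at the output
        d_h = [d_o * W2[j] * h[j] * (1 - h[j]) for j in range(H)]
        for j in range(H):                                    # move every weight downhill
            W2[j] -= lr * d_o * h[j]
            W1[j][0] -= lr * d_h[j] * x[0]
            W1[j][1] -= lr * d_h[j] * x[1]
            b1[j] -= lr * d_h[j]
        b2 -= lr * d_o

for x, y in data:
    h = [sig(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j]) for j in range(H)]
    o = sig(sum(W2[j] * h[j] for j in range(H)) + b2)
    print(x, round(o, 2), "target", y)   # outputs should end up near the targets
    # (with an unlucky initialization gradient descent can stall; try another seed)
```

Unlike the perceptron rule, there is no convergence guarantee here; this is exactly the "good heuristic" situation the slide describes.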

Applications Detectors (e.g. medical monitors). Noise filters (e.g. hearing aids). Future predictors (e.g. stock markets; also adaptive PDE solvers). Learning to steer a car! Many, many others...