Regression.

Slides from: Doug Gray, David Poole

Regression, Artificial Neural Networks 07/03/2017

Regression

Regression Supervised learning: based on training examples, learn a model which performs well on previously unseen examples. Regression: predicting real values.

Regression
Size (sqm)   District      Age (years)   Price (M Ft)
60           Csillag-tér   32            8.3
120          Alsóváros     21            26.8
35           Tarján        38            5.5
70           Belváros                    ???

Regression Training dataset: {xi, ri}, ri ∈ R. Evaluation metric: least squared error, E(g) = Σi (ri − g(xi))².
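
A small hedged sketch of the least-squared-error metric above; the model g and the data points are illustrative, not taken from the slides.

```python
# Least squared error: sum over the training set of (r_i - g(x_i))^2.
import numpy as np

def squared_error(g, xs, rs):
    xs, rs = np.asarray(xs, dtype=float), np.asarray(rs, dtype=float)
    return float(np.sum((rs - g(xs)) ** 2))

# Illustrative model and data (sizes in sqm, prices in M Ft)
g = lambda x: 0.22 * x - 3.0
print(squared_error(g, xs=[60, 120, 35], rs=[8.3, 26.8, 5.5]))
```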

Linear regression

Linear regression g(x) = w1x + w0. Fit by minimizing the squared error E(w1, w0) = Σi (ri − w1xi − w0)²; its gradient is 0 at the least-squares solution, which gives closed-form formulas for w1 and w0.
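
A hedged sketch of that closed-form solution (obtained by setting the gradient to zero); the flat-price numbers reuse the example table and are only illustrative.

```python
# Closed-form least-squares fit of g(x) = w1*x + w0.
import numpy as np

def fit_linear(x, r):
    x, r = np.asarray(x, dtype=float), np.asarray(r, dtype=float)
    x_mean, r_mean = x.mean(), r.mean()
    w1 = np.sum((x - x_mean) * (r - r_mean)) / np.sum((x - x_mean) ** 2)
    w0 = r_mean - w1 * x_mean
    return w1, w0

w1, w0 = fit_linear([60, 120, 35], [8.3, 26.8, 5.5])
print(w1, w0)          # learned slope and intercept
print(w1 * 70 + w0)    # predicted price for a 70 sqm flat
```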

Regression variants Decision tree: internal nodes are the same as in classification trees; leaves contain a constant or various linear models.

Regression SVM

Artificial Neural Networks

Artificial neural networks Motivation: simulation of the nervous system's (human brain's) information processing mechanisms. Structure: a huge number of densely connected, mutually interacting processing units (neurons). It learns from experience (training instances).

Some neurobiology… Neurons have many inputs and a single output. The output is either excited or not. The inputs from other neurons determine whether the neuron fires. Each input synapse has a weight. Inputs: dendrites. Processing: soma. Outputs: axons. Synapses: electrochemical contact between neurons. Basically, a biological neuron receives inputs from other sources, combines them in some way, performs a generally nonlinear operation on the result, and then outputs the final result.

A neuron in maths Weighted average of inputs. If the average is above a threshold T, it fires (outputs 1), else its output is 0 or -1. The basic unit of neural networks, the artificial neuron, simulates the four basic functions of natural neurons. Artificial neurons are much simpler than biological neurons.
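
A minimal hedged sketch of the thresholded neuron just described: a weighted combination of the inputs compared against a threshold T (the weights, inputs and threshold are made-up numbers).

```python
import numpy as np

def neuron(x, w, T=0.0):
    net = np.dot(w, x)             # weighted combination of the inputs
    return 1 if net >= T else -1   # fires (+1) above the threshold, otherwise -1

print(neuron(x=np.array([1.0, -1.0]), w=np.array([0.5, 0.3]), T=0.1))
```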

Statistics about the human brain #neurons: ~10^11 Avg. #connections per neuron: 10^4 Signal sending time: 10^-3 sec Face recognition: 10^-1 sec

Motivation (machine learning point of view) Goal: non-linear classification. Linear machines are not satisfactory in several real-world situations. Which non-linear function family should we choose? Neural networks: latent non-linear patterns are machine-learnt.

Perceptron

Multilayer perceptron = Neural Network Different representation at various layers.

Biologically, neural networks are constructed in a three-dimensional way from microscopic components, and these neurons seem capable of nearly unrestricted interconnections. This is not true of any man-made network. Artificial neural networks are simple clusterings of primitive artificial neurons. This clustering occurs by creating layers, which are then connected to one another; how these layers connect may also vary. Basically, all artificial neural networks have a similar topology: some of the neurons interface with the real world to receive its inputs, other neurons provide the real world with the network's outputs, and all the rest of the neurons are hidden from view.

The neurons are grouped into layers. The input layer consists of neurons that receive input from the external environment. The output layer consists of neurons that communicate the output of the system to the user or external environment. There are usually a number of hidden layers between these two; the simplest structure has only one hidden layer. When the input layer receives the input, its neurons produce output, which becomes input to the other layers of the system. The process continues until a certain condition is satisfied or until the output layer is invoked and fires its output to the external environment.

Inter-layer connections: there are different types of connections used between layers; these are called inter-layer connections. Fully connected: each neuron of the first layer is connected to every neuron of the second layer. Partially connected: a neuron of the first layer does not have to be connected to all neurons of the second layer. Feed forward: the neurons of the first layer send their output to the neurons of the second layer, but they do not receive any input back from the neurons of the second layer. Bi-directional: there is another set of connections carrying the output of the neurons of the second layer into the neurons of the first layer. Feed-forward and bi-directional connections can each be fully or partially connected. Hierarchical: in a hierarchical structure, the neurons of a lower layer may only communicate with neurons of the next layer. Resonance: the layers have bi-directional connections, and they can continue sending messages across the connections a number of times until a certain condition is achieved.

Multilayer perceptron

Feedforward neural networks Connections only to the next layer. The weights of the connections (between two layers) can be changed. Activation functions are used to calculate whether the neuron fires. Three-layer network: Input layer, Hidden layer, Output layer.

Network function The network function of neuron j: netj = Σi wji xi + wj0, where i is the index of the input neurons, wji is the weight between neurons i and j, and wj0 is the bias.
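
A one-line hedged sketch of the network function above; the weights and inputs are made-up numbers.

```python
# net_j = sum_i w_ji * x_i + w_j0 for a single neuron j
import numpy as np

x = np.array([0.5, -1.0, 2.0])     # inputs x_i
w_j = np.array([0.1, 0.4, -0.3])   # weights w_ji
w_j0 = 0.2                         # bias
net_j = float(np.dot(w_j, x) + w_j0)
print(net_j)
```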

Activation function The activation function is a non-linear function of the network value: yj = f(netj) (if it were linear, the whole network would be linear). The sign activation function: f(netj) = +1 if netj ≥ Tj, and −1 (or 0) otherwise, i.e. a step at the threshold Tj.

Differentiable activation functions Enable gradient descent-based learning. The sigmoid function: f(netj) = 1 / (1 + e^−(netj − Tj)), a smooth, differentiable approximation of the step at Tj.
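
A hedged sketch of the sigmoid and the derivative that gradient-descent training relies on (the threshold is folded into the bias here, i.e. Tj = 0).

```python
import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def sigmoid_prime(net):
    s = sigmoid(net)
    return s * (1.0 - s)   # f'(net) = f(net) * (1 - f(net))

print(sigmoid(0.0), sigmoid_prime(0.0))   # 0.5 and 0.25 at net = 0
```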

Output layer The output of neuron k: zk = f(Σj wkj yj + wk0), where k is the index on the output layer and the sum runs over the nH hidden neurons (nH is the number of hidden neurons). Binary classification: sign function. Multi-class classification: a neuron for each of the classes, and the argmax is predicted (discriminant function). Regression: linear transformation.
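
A hedged sketch of the three output read-outs listed above, applied to some made-up net values of the output neurons.

```python
import numpy as np

net_out = np.array([0.3, -1.2, 0.8])        # net_k for c = 3 output neurons

binary = int(np.sign(net_out[0]))           # binary classification: sign of a single output
predicted_class = int(np.argmax(net_out))   # multi-class: index of the largest output
regression_value = float(net_out[0])        # regression: linear (identity) output

print(binary, predicted_class, regression_value)
```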

y1 hidden unit calculates: sgn(x1 + x2 + 0.5); if NOT (x1 OR x2) then x1 + x2 + 0.5 < 0 and y1 = −1, otherwise y1 = +1, so y1 represents x1 OR x2. y2 hidden unit calculates: sgn(x1 + x2 − 1.5); if NOT (x1 AND x2) then x1 + x2 − 1.5 < 0 and y2 = −1, otherwise y2 = +1, so y2 represents x1 AND x2. The output neuron: z1 = 0.7y1 − 0.4y2 − 1; sgn(z1) is 1 iff y1 = 1 and y2 = −1, i.e. (x1 OR x2) AND NOT (x1 AND x2).
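
A quick numerical check of this two-layer construction, using the weights quoted on the slide with inputs and outputs in {−1, +1}.

```python
import numpy as np

def xor_net(x1, x2):
    y1 = np.sign(x1 + x2 + 0.5)             # y1 = +1 iff x1 OR x2
    y2 = np.sign(x1 + x2 - 1.5)             # y2 = +1 iff x1 AND x2
    return int(np.sign(0.7 * y1 - 0.4 * y2 - 1.0))

for x1 in (-1, 1):
    for x2 in (-1, 1):
        print(x1, x2, "->", xor_net(x1, x2))
# Prints +1 exactly for the two mixed inputs: (x1 OR x2) AND NOT (x1 AND x2), i.e. XOR.
```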

General (three-layer) feedforward network (c output units) The hidden units with their activation functions can express non-linear functions. The activation functions can differ across neurons (but in practice the same one is used).

Universal approximation theorem The universal approximation theorem states that a feed-forward network with a single hidden layer containing a finite number of neurons can approximate any continuous function. But the theorem does not give any hint on how to design activation functions for particular problems/datasets.

Training of neural networks (backpropagation)

Training of neural networks The network topology is given. The same activation function is used at each hidden neuron, and it is given. Training = calibration of the weights. On-line learning (epochs). The brain basically learns from experience. Neural networks are sometimes called machine learning algorithms because changing their connection weights (training) causes the network to learn the solution to a problem. The strength of the connection between two neurons is stored as a weight value for that specific connection. The system learns new knowledge by adjusting these connection weights. The learning ability of a neural network is determined by its architecture and by the algorithmic method chosen for training.

Training of neural networks 1. Forward propagation: an input vector propagates through the network. 2. Weight update (backpropagation): the weights of the network are changed in order to decrease the difference between the predicted and gold-standard values.

Training of neural networks we can calculate (propagate back) the error signal for each hidden neuron

tk is the target (gold standard) value of output neuron k, zk is the prediction at output neuron k (k = 1, …, c) and w are the weights. Error: J(w) = ½ Σk (tk − zk)². Backpropagation is a gradient descent algorithm: the initial weights are random, then each step updates them by Δw = −η ∂J/∂w, where η is the learning rate.

Backpropagation The error gradient of the weights between the hidden and output layers: ∂J/∂wkj = (∂J/∂netk)(∂netk/∂wkj) = −δk yj, where the error signal for output neuron k is δk = −∂J/∂netk = (tk − zk) f′(netk).

because netk = wkᵀy, we have ∂netk/∂wkj = yj, and the change of the weights between the hidden and output layers is: Δwkj = η δk yj = η (tk − zk) f′(netk) yj.

The gradient with respect to the hidden-unit weights (chain rule): ∂J/∂wji = (∂J/∂yj)(∂yj/∂netj)(∂netj/∂wji).

The error signal of hidden unit j: δj = f′(netj) Σk wkj δk. The weight change between the input and hidden layers: Δwji = η δj xi.

Backpropagation Calculate the error signal for the output neurons and update the weights between the output and hidden layers (the weights leading to output neuron k). [Figure: output – hidden – input layers]

Backpropagation Calculate the error signal for the hidden neurons. [Figure: output – hidden – input layers]

Backpropagation Update the weights between the input and hidden neurons (the weights leading to hidden neuron j). [Figure: output – hidden – input layers]

Training of neural networks w initialised randomly. Stochastic backpropagation (pseudocode):
Begin initialize nH; w, stopping criterion θ, learning rate η, m ← 0
  do m ← m + 1
    xm ← a randomly sampled training instance
    wji ← wji + η δj xi;  wkj ← wkj + η δk yj
  until ||∇J(w)|| < θ
  return w
End
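
Below is a hedged Python version of the same stochastic backpropagation loop for a three-layer network; the concrete choices (tanh hidden units, a linear output, the toy XOR data, the learning rate, the number of steps) are illustrative assumptions, not part of the original slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Network sizes (illustrative): 2 inputs, 4 hidden units, 1 output
n_in, n_hidden, n_out = 2, 4, 1
W1 = rng.normal(scale=0.5, size=(n_hidden, n_in + 1))    # input->hidden weights (incl. bias)
W2 = rng.normal(scale=0.5, size=(n_out, n_hidden + 1))   # hidden->output weights (incl. bias)
eta = 0.1                                                # learning rate

# Toy training set: XOR with targets in {-1, +1}
X = np.array([[-1., -1.], [-1., 1.], [1., -1.], [1., 1.]])
T = np.array([[-1.], [1.], [1.], [-1.]])

def forward(x):
    x = np.append(x, 1.0)                       # bias input
    net_h = W1 @ x
    y = np.append(np.tanh(net_h), 1.0)          # hidden activations + bias unit
    z = W2 @ y                                  # linear output
    return x, net_h, y, z

for step in range(5000):
    i = rng.integers(len(X))                    # sample one training instance
    x, net_h, y, z = forward(X[i])
    delta_out = T[i] - z                                      # delta_k = (t_k - z_k) * f'(net_k); f' = 1 for a linear output
    delta_hid = (1 - np.tanh(net_h) ** 2) * (W2[:, :-1].T @ delta_out)
    W2 += eta * np.outer(delta_out, y)          # Delta w_kj = eta * delta_k * y_j
    W1 += eta * np.outer(delta_hid, x)          # Delta w_ji = eta * delta_j * x_i

print([round(float(forward(x)[3][0]), 2) for x in X])   # should approach [-1, 1, 1, -1]
```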

Stopping based on the performance on a validation dataset The use of held-out (unseen) training instances for estimating the performance of supervised learning (to avoid overfitting). Stop at the minimum error on the validation set.
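
A hedged sketch of this early-stopping rule; the train_one_epoch and validation_error callables are hypothetical stand-ins for the training loop above and a validation-set evaluation.

```python
import copy

def train_with_early_stopping(model, train_one_epoch, validation_error,
                              patience=10, max_epochs=1000):
    """Keep the weights with the lowest validation error; stop after
    `patience` epochs without improvement (a sketch, not the slides' code)."""
    best_error, best_model, since_best = float("inf"), None, 0
    for epoch in range(max_epochs):
        train_one_epoch(model)            # one pass over the training set
        err = validation_error(model)     # error on the held-out validation set
        if err < best_error:
            best_error, best_model, since_best = err, copy.deepcopy(model), 0
        else:
            since_best += 1
            if since_best >= patience:
                break                     # validation error stopped improving
    return best_model
```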

Notes on backpropagation It can get stuck at local minima. In practice, the local minima are often close to the global one. Multiple training runs starting from various randomly initialized weights might help: we can take the trained network with the minimal error (on a validation set), or use voting schemes to combine the networks.

Questions of network design How many hidden neurons? Too few neurons cannot learn complex patterns; too many neurons can easily overfit (use a validation set?). Learning rate!?

Deep learning

History of neural networks Perceptron: one of the first machine learners ~1950 Backpropagation: multilayer perceptrons, 1975- Deep learning: popular again 2006-

Auto-encoder pretraining

Greedy layer-wise pretraining

Rectifier networks

Dropout

Block networks

Recurrent neural networks short-term memory http://www.youtube.com/watch?v=vmDByFN6eig