Last lecture summary Naïve Bayes Classifier

Bayes Rule

$$P(Y \mid X) = \frac{P(X \mid Y)\,P(Y)}{P(X)}$$

posterior = (likelihood × prior) / normalization constant

Prior and likelihood must be learnt (i.e. estimated from the data).

learning the prior – A hundred independently drawn training examples will usually suffice to obtain a reasonable estimate of P(Y). learning the likelihood – The Naïve Bayes Assumption: assume that all features are independent given the class label Y.
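Written out, the assumption factorizes the likelihood into per-feature terms (a standard formulation, not shown on the original slide):

```latex
% Naive Bayes assumption: features X_1,...,X_n are conditionally independent given Y
P(X_1,\dots,X_n \mid Y) = \prod_{i=1}^{n} P(X_i \mid Y)
% which gives the classifier
\hat{y} = \arg\max_{y} \; P(Y=y)\,\prod_{i=1}^{n} P(X_i = x_i \mid Y=y)
```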

Example – Play Tennis

Example – Learning Phase

P(Play=Yes) = 9/14    P(Play=No) = 5/14

Outlook       Play=Yes   Play=No
Sunny         2/9        3/5
Overcast      4/9        0/5
Rain          3/9        2/5

Temperature   Play=Yes   Play=No
Hot           2/9        2/5
Mild          4/9        2/5
Cool          3/9        1/5

Humidity      Play=Yes   Play=No
High          3/9        4/5
Normal        6/9        1/5

Wind          Play=Yes   Play=No
Strong        3/9        3/5
Weak          6/9        2/5

e.g. P(Outlook=Sunny | Play=Yes) = 2/9

Example - Prediction

x' = (Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong)

Look up the tables:
P(Outlook=Sunny|Play=Yes) = 2/9        P(Outlook=Sunny|Play=No) = 3/5
P(Temp=Cool|Play=Yes) = 3/9            P(Temp=Cool|Play=No) = 1/5
P(Hum=High|Play=Yes) = 3/9             P(Hum=High|Play=No) = 4/5
P(Wind=Strong|Play=Yes) = 3/9          P(Wind=Strong|Play=No) = 3/5
P(Play=Yes) = 9/14                     P(Play=No) = 5/14

P(Yes|x') ∝ [P(Sunny|Yes) P(Cool|Yes) P(High|Yes) P(Strong|Yes)] P(Play=Yes) = (2/9)(3/9)(3/9)(3/9)(9/14) ≈ 0.0053
P(No|x')  ∝ [P(Sunny|No) P(Cool|No) P(High|No) P(Strong|No)] P(Play=No) = (3/5)(1/5)(4/5)(3/5)(5/14) ≈ 0.0206

Since P(Yes|x') < P(No|x'), we label x' as "No".
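A minimal Python sketch of the same computation (the probabilities are taken from the slide; the function and variable names are mine):

```python
# Conditional probability tables from the learning phase (Play Tennis data)
prior = {"Yes": 9/14, "No": 5/14}
likelihood = {
    "Outlook=Sunny": {"Yes": 2/9, "No": 3/5},
    "Temp=Cool":     {"Yes": 3/9, "No": 1/5},
    "Hum=High":      {"Yes": 3/9, "No": 4/5},
    "Wind=Strong":   {"Yes": 3/9, "No": 3/5},
}

def predict(features):
    """Return the class with the largest unnormalized posterior."""
    scores = {}
    for label in prior:
        score = prior[label]
        for f in features:
            score *= likelihood[f][label]   # naive Bayes: multiply per-feature likelihoods
        scores[label] = score
    return max(scores, key=scores.get), scores

label, scores = predict(["Outlook=Sunny", "Temp=Cool", "Hum=High", "Wind=Strong"])
print(label, scores)   # "No", {'Yes': ~0.0053, 'No': ~0.0206}
```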

Last lecture summary Binary classifier performance

TP, TN, FP, FN
Precision, Positive Predictive Value (PPV): TP / (TP + FP)
Recall, Sensitivity, True Positive Rate (TPR), Hit rate: TP / P = TP / (TP + FN)
False Positive Rate (FPR), Fall-out: FP / N = FP / (FP + TN)
Specificity, True Negative Rate (TNR): TN / (TN + FP) = 1 − FPR
Accuracy: (TP + TN) / (TP + TN + FP + FN)
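A small sketch computing these quantities from confusion-matrix counts (the function name and example numbers are mine):

```python
def binary_metrics(tp, tn, fp, fn):
    """Standard binary-classifier metrics from confusion-matrix counts."""
    precision   = tp / (tp + fp)              # PPV
    recall      = tp / (tp + fn)              # sensitivity, TPR, hit rate
    fpr         = fp / (fp + tn)              # fall-out
    specificity = tn / (tn + fp)              # TNR = 1 - FPR
    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, fpr, specificity, accuracy

print(binary_metrics(tp=40, tn=45, fp=5, fn=10))
```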

Neural networks (new stuff)

Biological motivation The human brain has been estimated to contain about 10^11 brain cells (neurons). A neuron is an electrically excitable cell that processes and transmits information by electrochemical signaling. Each neuron is connected with other neurons through connections called synapses. A typical neuron possesses a cell body (often called the soma), dendrites (many, on the order of mm), and an axon (one, 10 cm – 1 m).

A synapse permits a neuron to pass an electrical or chemical signal to another cell. A synapse can be either excitatory or inhibitory. Synapses are of different strengths (the stronger the synapse, the more important it is). The effects of synapses accumulate inside the neuron. When the cumulative effect of the synapses reaches a certain threshold, the neuron is activated and the signal is sent to the axon, through which the neuron is connected to other neuron(s).

Neural networks for applied science and engineering, Samarasinghe

Warren McCulloch and Walter Pitts – the threshold neuron

1st mathematical model of a neuron – McCulloch & Pitts
– binary (threshold) neuron – only binary inputs and output
– the weights are pre-set, no learning
[truth table: inputs x1, x2, output t]


Heaviside (threshold) activation function
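A minimal sketch of such a threshold unit in Python (the weights and threshold below are illustrative, not from the slide):

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Binary threshold neuron: fire (1) if the weighted sum reaches the threshold."""
    s = sum(x * w for x, w in zip(inputs, weights))
    return 1 if s >= threshold else 0

# Example: with weights (1, 1) and threshold 2 the unit computes logical AND
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, mcculloch_pitts((x1, x2), (1, 1), threshold=2))
```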

Perceptron (1957) – Frank Rosenblatt. Developed the learning algorithm. Used his neuron (pattern recognizer = perceptron) for the classification of letters.

Multiple output perceptron
– for multicategory (i.e. more than 2 classes) classification
– one output neuron for each class
– input layer, output layer
– single layer (one-layered) vs. double layer (two-layered)
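A sketch of the forward pass of such a single-layer, multi-output perceptron (shapes, names and weights below are illustrative, not from the slide):

```python
import numpy as np

def forward(x, W, b):
    """One output neuron per class: threshold each weighted sum."""
    scores = W @ x + b                 # shape: (n_classes,)
    return (scores >= 0).astype(int)   # binary output of each threshold unit

# 3 inputs, 4 classes; random illustrative weights
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
b = np.zeros(4)
print(forward(np.array([1.0, 0.0, 1.0]), W, b))
```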

Learning

Requirements for the minimum. The gradient (grad) is a vector pointing in the direction of the greatest rate of increase of the function. Since we want to descend, we move in the direction of −grad.
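In symbols (a standard formulation, not shown on the original slide): at a minimum of the error E the partial derivatives vanish, and the gradient collects them into a vector that gradient descent moves against:

```latex
% necessary condition for a minimum of E(w_1, \dots, w_n)
\frac{\partial E}{\partial w_j} = 0 \quad \text{for all } j
% the gradient; gradient descent moves against it
\nabla E = \left( \frac{\partial E}{\partial w_1}, \dots, \frac{\partial E}{\partial w_n} \right),
\qquad \Delta \mathbf{w} \propto -\nabla E
```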

Delta rule

error gradient

To find the gradient, differentiate the error E with respect to $w_1$: $\partial E / \partial w_1$. According to the delta rule, the weight change is proportional to the negative of the error gradient: $\Delta w_1 = -\beta \, \partial E / \partial w_1$. New weight: $w_1^{\text{new}} = w_1 + \Delta w_1$.

β is called the learning rate. It determines how far along the gradient it is necessary to move.

The new weight after the i-th iteration: $w_1^{(i+1)} = w_1^{(i)} + \Delta w_1^{(i)} = w_1^{(i)} - \beta \, \partial E / \partial w_1 \big|_{w_1^{(i)}}$.
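A tiny numerical sketch of this update for a single linear neuron with squared error (data, learning rate and names below are illustrative):

```python
# Delta rule for one weight of a linear neuron y = w * x, squared error E = 0.5 * (t - y)**2
def delta_rule_step(w, x, t, beta=0.1):
    y = w * x
    grad = -(t - y) * x          # dE/dw
    return w - beta * grad       # new weight = old weight - beta * gradient

w = 0.0
for i in range(20):              # iterate: one pattern presented repeatedly
    w = delta_rule_step(w, x=1.5, t=3.0, beta=0.1)
print(w)                         # approaches t / x = 2.0
```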

This is an iterative algorithm; one pass through the training set is not enough. One pass over the whole training data set is called an epoch. Adjusting the weights after each input pattern presentation (iteration) is called example-by-example (online) learning. – For some problems this can cause the weights to oscillate – the adjustment required by one pattern may be canceled by the next pattern. – More popular is the next method.

Batch learning
– wait until all input patterns (i.e. the whole epoch) have been processed and then adjust the weights in the average sense.
– More stable solution.
– Obtain the error gradient for each input pattern.
– Average them at the end of the epoch.
– Use this average value to adjust the weights using the delta rule.
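A sketch contrasting the two update schedules for the same linear neuron (the data set and names are illustrative):

```python
# Online vs. batch delta rule for y = w * x on a small data set
data = [(1.0, 2.0), (2.0, 3.9), (3.0, 6.1)]    # (x, target) pairs
beta = 0.05

def grad(w, x, t):
    return -(t - w * x) * x                    # dE/dw for squared error

# online (example-by-example): update after every pattern presentation
w_online = 0.0
for epoch in range(50):
    for x, t in data:
        w_online -= beta * grad(w_online, x, t)

# batch: average the gradients over the epoch, then update once
w_batch = 0.0
for epoch in range(50):
    g = sum(grad(w_batch, x, t) for x, t in data) / len(data)
    w_batch -= beta * g

print(w_online, w_batch)   # both close to ~2.0
```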

Perceptron failure. Please help me and draw the following functions on the blackboard: AND, OR, XOR (eXclusive OR – true when exactly one of the operands is true, otherwise false). [plots of AND, OR and XOR in the input plane – XOR: ???]

The perceptron forms a linear decision boundary, so only linearly separable problems can be solved – the famous book "Perceptrons" by Marvin Minsky and Seymour Papert showed that it was impossible for this class of network to learn an XOR function. They conjectured (incorrectly!) that a similar result would hold for a perceptron with three or more layers. The often-cited Minsky/Papert text caused a significant decline in interest and funding of neural network research. It took ten more years until neural network research experienced a resurgence in the 1980s.
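A small experiment illustrating the point: the same perceptron training loop separates AND but never reaches zero errors on XOR (a sketch using the classic perceptron rule; the function name and learning rate are mine):

```python
import numpy as np

def train_perceptron(X, t, epochs=100, beta=0.1):
    """Threshold perceptron trained with the perceptron rule; returns weights, bias and errors in the last epoch."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in zip(X, t):
            y = 1 if w @ x + b >= 0 else 0
            if y != target:
                w += beta * (target - y) * x
                b += beta * (target - y)
                errors += 1
    return w, b, errors

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
print(train_perceptron(X, np.array([0, 0, 0, 1])))  # AND: converges, 0 errors in the last epoch
print(train_perceptron(X, np.array([0, 1, 1, 0])))  # XOR: errors never reach 0
```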

Play with