Single Neuron Hung-yi Lee

Single Neuron. [Diagram: inputs feed into a neuron through weights, a bias is added, and an activation function produces the output.]

Learning to say “yes/no” Binary Classification

Learning to say yes/no (Binary Classification):
- Spam filtering: Is an e-mail spam or not?
- Recommendation systems: Recommend the product to the customer or not?
- Malware detection: Is the software malicious or not?
- Stock prediction: Will the future value of a stock increase or not with respect to its current value?

Example Application: Spam filtering. [Figure: an incoming e-mail is labeled either Spam or Not Spam.]

Example Application: Spam filtering. How do we estimate P(yes|x)? What does the function f look like?

Example Application: Spam filtering. To estimate P(yes|x), collect examples first: training e-mails labeled Yes (Spam) or No (Not Spam), e.g., x_1: "Earn … free …… free" (Yes), x_2: "Talk … Meeting …" (No), x_3: "Win … free……" (Yes). Some words frequently appear in spam, e.g., "free". Use the frequency of "free" to decide whether an e-mail is spam: estimate P(yes | x_free = k), where x_free is the number of times "free" appears in x.
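
A minimal sketch of this counting step (the toy e-mails and labels below are invented for illustration; they are not from the lecture):

```python
# Estimate p(yes | x_free = k): among training e-mails containing the
# word "free" exactly k times, what fraction are spam?
from collections import Counter

# Toy labeled training data: (e-mail text, is_spam). Illustrative only.
training_data = [
    ("earn free cash free now", True),
    ("win a free prize", True),
    ("talk about the meeting", False),
    ("meeting moved to friday", False),
    ("free free offer inside", True),
]

spam_count = Counter()   # k -> number of spam e-mails with k "free"s
total_count = Counter()  # k -> number of e-mails with k "free"s

for text, is_spam in training_data:
    k = text.split().count("free")
    total_count[k] += 1
    if is_spam:
        spam_count[k] += 1

def p_yes_given_k(k):
    """Empirical estimate of p(yes | x_free = k); None if k never seen."""
    if total_count[k] == 0:
        return None
    return spam_count[k] / total_count[k]

print(p_yes_given_k(0))  # fraction of spam among e-mails with no "free"
print(p_yes_given_k(3))  # None: no training e-mail has exactly 3 "free"s
```

This runs straight into the next slide's problem: counting alone says nothing about values of x_free that never occur in the training data.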

Regression. [Plot: estimated p(yes|x_free) against the frequency of "free" (x_free) in an e-mail x; p(yes|x_free = 0) = 0.1 and p(yes|x_free = 1) = 0.4.] Problem: What if one day you receive an e-mail with 3 "free"s? In the training data, there is no e-mail containing 3 "free"s.

Regression. Fit f(x_free) = w x_free + b, where f(x_free) is an estimate of p(yes|x_free). Store w and b. [Plot: the regression line over p(yes|x_free) versus x_free.]
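
A sketch of this fit by least squares. The points at x_free = 0 and 1 are the probabilities quoted on the earlier slide (0.1 and 0.4); the value at x_free = 2 is an assumed point added so the fit has more than two points:

```python
import numpy as np

# Estimated probabilities: 0.1 and 0.4 are from the slide; 0.7 is assumed.
k_values = np.array([0.0, 1.0, 2.0])
p_values = np.array([0.1, 0.4, 0.7])

# Least-squares fit of f(x_free) = w * x_free + b.
w, b = np.polyfit(k_values, p_values, deg=1)
print(f"w = {w:.2f}, b = {b:.2f}")  # store w and b

def f(x_free):
    """Estimate of p(yes | x_free)."""
    return w * x_free + b

print(f(3))  # now unseen counts get an estimate...
print(f(6))  # ...but f(6) = 1.9, not a valid probability (next slide)
```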

Regression. Problem: What if one day you receive an e-mail with 6 "free"s? With f(x_free) = w x_free + b, the output of f is not between 0 and 1.

Logit. [Plots: the probability to be spam, p = p(yes|x_free), against x_free, where p is always between 0 and 1; next to it, logit(p) = ln(p/(1-p)) against x_free, which is unbounded.]

Logit. Fit f'(x_free) = w' x_free + b', where f'(x_free) is an estimate of logit(p) rather than of p itself, so the straight-line fit no longer conflicts with the 0-to-1 range.

Logit. Store w' and b'. For a new e-mail, compute f'(x_free) = w' x_free + b'; when the corresponding p is > 0.5, i.e., when f'(x_free) > 0, say "yes". This thresholded linear function is a perceptron.
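
A sketch of this logit-scale fit and the resulting perceptron decision, using the same illustrative probability estimates as above:

```python
import numpy as np

def logit(p):
    """logit(p) = ln(p / (1 - p)): maps (0, 1) onto the whole real line."""
    return np.log(p / (1.0 - p))

# Same illustrative estimates as before, transformed to the logit scale.
k_values = np.array([0.0, 1.0, 2.0])
p_values = np.array([0.1, 0.4, 0.7])

# Fit f'(x_free) = w' * x_free + b' to logit(p) instead of p.
w_prime, b_prime = np.polyfit(k_values, logit(p_values), deg=1)  # store these

def classify(x_free):
    """Say "yes" when the estimated p > 0.5, i.e., when logit(p) > 0."""
    return "yes" if w_prime * x_free + b_prime > 0 else "no"

print(classify(0))  # "no":  few "free"s, low estimated spam probability
print(classify(6))  # "yes": many "free"s, and no out-of-range outputs
```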

Multiple Variables. Consider two words, "free" and "hello", and compute p = p(yes | x_free, x_hello). [Plot: training e-mails placed in the (x_free, x_hello) plane.]

Multiple Variables. Regression now fits both variables at once to estimate p(yes | x_free, x_hello). [Plot: the fitted surface over the (x_free, x_hello) plane.]

Multiple Variables. Of course, we can consider all words {t_1, t_2, …, t_N} in a dictionary: z = w_1 x_1 + w_2 x_2 + … + w_N x_N + b, where x_i counts the word t_i and z is to approximate logit(p).

Logistic Regression. Each e-mail x is represented by its word counts: t_1 appears 3 times, t_2 appears 0 times, …, t_N appears 1 time. But for an individual training e-mail, the probability to be spam p is always exactly 1 or 0, so ln(p/(1-p)) = +infinity or -infinity, and we cannot do regression on the logit directly.

Logistic Regression. Instead, feed z through the Sigmoid Function, sigma(z) = 1/(1 + e^(-z)), whose output always lies strictly between 0 and 1. [Plot: the S-shaped sigmoid curve.]
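
The sigmoid in code, showing how it pins every output into (0, 1):

```python
import math

def sigmoid(z):
    """sigma(z) = 1 / (1 + e^(-z)), strictly between 0 and 1 for any real z."""
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(-5))  # ~0.007, close to 0
print(sigmoid(0))   # 0.5, the decision boundary
print(sigmoid(5))   # ~0.993, close to 1
```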

Logistic Regression. For a spam training example x_1 (Yes), the output should be close to 1; for a non-spam example x_2 (No), the output should be close to 0.

Logistic Regression. [Diagram: a single neuron; each feature feeds in through a weight, a bias is added, and the sigmoid activation pushes the output toward 1 (Yes) or 0 (No).]

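Putting the pieces together, a minimal sketch of the single neuron drawn above: feature counts go through per-word weights, a bias is added, and the sigmoid produces p(yes|x). The word list, weights, and bias are invented for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    """Single neuron: weighted sum of the features plus bias, then sigmoid."""
    z = np.dot(w, x) + b  # z approximates logit(p)
    return sigmoid(z)     # output approximates p(yes | x)

# Feature vector: counts of each dictionary word t_1 .. t_N in the e-mail.
x = np.array([3.0, 0.0, 1.0])   # e.g., "free" x3, "meeting" x0, "win" x1
w = np.array([1.5, -2.0, 1.0])  # illustrative weights, one per word
b = -1.0                        # illustrative bias

p = neuron(x, w, b)
print("yes (spam)" if p > 0.5 else "no (not spam)", p)  # ~0.99 -> yes
```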

Reference for Linear Classifier: classifiers.pdf

More than saying “yes/no” Multiclass Classification

More than saying "yes/no": Handwritten digit classification. This is Multiclass Classification.

More than saying "yes/no". Handwritten digit classification; simplify the question: is an image "2" or not? The feature of an image describes the characteristics of the input object: each pixel corresponds to one dimension in the feature vector.

More than saying "yes/no". [Diagram: the pixel features of an image feed into a single neuron that answers "2" or not.]

More than saying "yes/no". Train one binary classifier per digit: "1" or not, "2" or not, "3" or not, …. Feed the image to all of them; if y_2 is the max, then the image is "2".
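
A sketch of this one-classifier-per-digit scheme. The 28x28 image size is an assumption (as in MNIST; the slides do not fix it), and the weights are untrained random placeholders:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
num_pixels = 28 * 28   # one feature dimension per pixel (assumed size)
num_classes = 10       # digits 0-9

# One binary "is it digit d?" neuron per class (untrained placeholders).
W = rng.normal(size=(num_classes, num_pixels))
b = rng.normal(size=num_classes)

def classify(image):
    """Run all ten binary classifiers; the largest output wins."""
    y = sigmoid(W @ image + b)  # y[d] estimates p(image is digit d)
    return int(np.argmax(y))    # if y[2] is the max, the answer is "2"

image = rng.random(num_pixels)  # stand-in for a real flattened image
print(classify(image))
```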

Limitation of a single neuron. Motivation for cascading neurons.

Limitation of Logistic Regression. No single neuron can reproduce the truth table below, because no straight line in the (x_1, x_2) plane separates the Yes inputs from the No inputs.

Input (x_1, x_2) | Output
(0, 0) | No
(0, 1) | Yes
(1, 0) | Yes
(1, 1) | No

Limitation of Logistic Regression. The table is the XOR function, and it can be decomposed into simpler gates: XOR(x_1, x_2) = (x_1 OR x_2) AND NOT (x_1 AND x_2). Each of AND, OR, and NOT is linearly separable, so each is computable by a single neuron.
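
A quick check of this gate decomposition in code; each individual gate is one neuron's worth of work, but their composition is not:

```python
def xor_via_gates(x1, x2):
    """XOR built from simpler gates: (x1 OR x2) AND NOT (x1 AND x2)."""
    return (x1 or x2) and not (x1 and x2)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "Yes" if xor_via_gates(x1, x2) else "No")
# 0 0 No / 0 1 Yes / 1 0 Yes / 1 1 No: matches the truth table, and no
# single line in the (x1, x2) plane separates the Yes points from the No.
```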

Neural Network. Cascade the gate neurons: two Hidden Neurons compute the intermediate values, and an output neuron combines them. Such a network of connected neurons is a Neural Network. [Diagram: the two-layer network realizing XOR.]

[Figure: the hidden-neuron activations for the four inputs; the values (a_1, a_2) are (0.27, 0.27), (0.73, 0.05), and (0.05, 0.73), with two of the inputs sharing (0.27, 0.27).]

[Figure: plotted in the hidden feature space (a_1, a_2), the four inputs land on (0.27, 0.27), (0.73, 0.05), and (0.05, 0.73); the Yes points and the No points are now linearly separable, so one output neuron with bias b can classify them.]
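
A sketch of a hidden layer that reproduces the activation values shown. The weights (plus or minus 2, with bias -1 on each hidden neuron) are an assumed choice that happens to yield 0.27, 0.73, and 0.05; the slide does not state them:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hidden_layer(x1, x2):
    """Two hidden neurons remap (x1, x2) into a new feature space (a1, a2)."""
    a1 = sigmoid(2 * x1 - 2 * x2 - 1)   # assumed weights and bias
    a2 = sigmoid(-2 * x1 + 2 * x2 - 1)  # assumed weights and bias
    return a1, a2

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    a1, a2 = hidden_layer(x1, x2)
    print((x1, x2), "->", (round(a1, 2), round(a2, 2)))
# (0,0) and (1,1) both land on (0.27, 0.27); (1,0) -> (0.73, 0.05) and
# (0,1) -> (0.05, 0.73). One straight line in (a1, a2) space now separates
# the Yes inputs from the No inputs, so a single output neuron finishes the job.
```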

Thank you for listening!

Appendix

More references:
- e/2_GD_REG_pton_NN/lecture_notes/logistic_regression_loss_function/logistic_regression_loss.pdf
- error-function-minimized-in.html
- nonconvex.pdf
- lms.pdf

Logistic Regression. [Appendix figure.]