CSC 4510 – Machine Learning Dr. Mary-Angela Papalaskari Department of Computing Sciences Villanova University Course website: www.csc.villanova.edu/~map/4510/


6: Logistic Regression
The slides in this presentation are adapted from Andrew Ng's ML course.

Machine learning problems
Supervised learning: classification, regression
Unsupervised learning
Others: reinforcement learning, recommender systems
Also talk about: practical advice for applying learning algorithms


Classification
Spam / Not Spam?
Online transactions: Fraudulent (Yes / No)?
Tumor: Malignant / Benign?
y ∈ {0, 1}
0: "Negative Class" (e.g., benign tumor)
1: "Positive Class" (e.g., malignant tumor)

[Plot: Malignant? (1 = Yes, 0 = No) vs. Tumor Size]
Threshold classifier output h_θ(x) at 0.5:
If h_θ(x) ≥ 0.5, predict "y = 1"
If h_θ(x) < 0.5, predict "y = 0"

Classification: y = 0 or 1
With linear regression, h_θ(x) can be > 1 or < 0
Logistic regression: 0 ≤ h_θ(x) ≤ 1

Logistic Regression Model
Want 0 ≤ h_θ(x) ≤ 1
Old regression model: h_θ(x) = θᵀx
New model: h_θ(x) = g(θᵀx), where g(z) = 1 / (1 + e^(−z))
g is called the sigmoid function (or logistic function)
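As a minimal sketch of this model in Python (plain standard-library code; the function names are illustrative, not from the slides):

```python
import math

def sigmoid(z):
    """Sigmoid (logistic) function: g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x); theta and x are plain lists of floats."""
    z = sum(t * xj for t, xj in zip(theta, x))
    return sigmoid(z)
```

sigmoid(0) is exactly 0.5, and the output is always strictly between 0 and 1, which is what lets h_θ(x) be read as a probability.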

Interpretation of Hypothesis Output
h_θ(x) = estimated probability that y = 1 on input x
h_θ(x) = P(y = 1 | x; θ), "probability that y = 1, given x, parameterized by θ"
Example: if h_θ(x) = 0.7 for a patient's tumor features x, tell the patient there is a 70% chance of the tumor being malignant

Logistic regression
g(z) ≥ 0.5 whenever z ≥ 0
Predict "y = 1" if h_θ(x) ≥ 0.5, i.e. θᵀx ≥ 0
Predict "y = 0" if h_θ(x) < 0.5, i.e. θᵀx < 0

Decision Boundary
h_θ(x) = g(θ_0 + θ_1 x_1 + θ_2 x_2)
Example: with θ_0 = −3, θ_1 = 1, θ_2 = 1, predict "y = 1" if −3 + x_1 + x_2 ≥ 0, i.e. x_1 + x_2 ≥ 3
The line x_1 + x_2 = 3 is the decision boundary
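A short Python sketch of thresholding the hypothesis, assuming the illustrative parameters θ_0 = −3, θ_1 = 1, θ_2 = 1 (so the boundary is the line x_1 + x_2 = 3):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative parameters: predict y = 1 exactly when -3 + x1 + x2 >= 0.
theta = [-3.0, 1.0, 1.0]

def predict(x1, x2):
    """Return 1 if h_theta(x) >= 0.5, else 0 (the x0 = 1 term is implicit)."""
    z = theta[0] + theta[1] * x1 + theta[2] * x2
    return 1 if sigmoid(z) >= 0.5 else 0
```

Points above the line x_1 + x_2 = 3 are classified as y = 1 and points below as y = 0; points exactly on the line get y = 1 because g(0) = 0.5 meets the ≥ 0.5 threshold.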

Non-linear decision boundaries
h_θ(x) = g(θ_0 + θ_1 x_1 + θ_2 x_2 + θ_3 x_1² + θ_4 x_2²)
Example: with θ = (−1, 0, 0, 1, 1), predict "y = 1" if −1 + x_1² + x_2² ≥ 0, i.e. x_1² + x_2² ≥ 1
The decision boundary is the circle x_1² + x_2² = 1

Training set: {(x^(1), y^(1)), (x^(2), y^(2)), …, (x^(m), y^(m))}, m examples
x ∈ ℝ^(n+1) with x_0 = 1, y ∈ {0, 1}
h_θ(x) = 1 / (1 + e^(−θᵀx))
How to choose parameters θ?

Cost function
Linear regression: J(θ) = (1/m) Σᵢ (1/2)(h_θ(x^(i)) − y^(i))²
With the sigmoid hypothesis this squared-error cost is non-convex in θ, so logistic regression uses a different cost function

Logistic regression cost function
If y = 1: Cost(h_θ(x), y) = −log(h_θ(x))
Cost = 0 if h_θ(x) = 1, and Cost → ∞ as h_θ(x) → 0 (a confident wrong prediction is penalized heavily)

Logistic regression cost function
If y = 0: Cost(h_θ(x), y) = −log(1 − h_θ(x))
Cost = 0 if h_θ(x) = 0, and Cost → ∞ as h_θ(x) → 1

Logistic regression cost function
J(θ) = (1/m) Σᵢ Cost(h_θ(x^(i)), y^(i))
Cost(h_θ(x), y) = −log(h_θ(x)) if y = 1
Cost(h_θ(x), y) = −log(1 − h_θ(x)) if y = 0

Logistic regression cost function – more compact expression
Cost(h_θ(x), y) = −y log(h_θ(x)) − (1 − y) log(1 − h_θ(x))
(equivalent to the piecewise definition, since y is always 0 or 1)

Logistic regression cost function – more compact expression
J(θ) = −(1/m) Σᵢ [ y^(i) log(h_θ(x^(i))) + (1 − y^(i)) log(1 − h_θ(x^(i))) ]
To fit parameters θ: min_θ J(θ)
To make a prediction given new x: output h_θ(x)
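The cost J(θ) can be computed directly from this formula; a sketch in plain Python (names illustrative):

```python
import math

def cost(theta, X, y):
    """Logistic regression cost J(theta).
    X: list of feature vectors (each starting with the x0 = 1 term),
    y: list of 0/1 labels."""
    m = len(y)
    total = 0.0
    for xi, yi in zip(X, y):
        z = sum(t * xj for t, xj in zip(theta, xi))
        h = 1.0 / (1.0 + math.exp(-z))
        total += -yi * math.log(h) - (1 - yi) * math.log(1 - h)
    return total / m
```

At θ = 0 the hypothesis outputs 0.5 for every example, so J(θ) = log 2 ≈ 0.693 regardless of the labels; training should drive the cost below that.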

Gradient Descent
Want min_θ J(θ):
Repeat { θ_j := θ_j − α ∂J(θ)/∂θ_j } (simultaneously update all θ_j)

Gradient Descent
Want min_θ J(θ):
Repeat { θ_j := θ_j − α (1/m) Σᵢ (h_θ(x^(i)) − y^(i)) x_j^(i) } (simultaneously update all θ_j)
The update rule looks identical to linear regression, but h_θ(x) is now the sigmoid of θᵀx rather than θᵀx itself
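This update rule can be sketched as a small batch-gradient-descent loop in Python (illustrative, unvectorized code; the choices of α and iteration count are arbitrary):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient_descent(X, y, alpha=0.1, iters=2000):
    """Fit logistic regression by repeating
    theta_j := theta_j - alpha * (1/m) * sum_i (h(x^(i)) - y^(i)) * x_j^(i),
    updating every theta_j simultaneously."""
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(iters):
        # h_theta(x^(i)) - y^(i) for every example
        errors = [sigmoid(sum(t * xj for t, xj in zip(theta, xi))) - yi
                  for xi, yi in zip(X, y)]
        grad = [sum(e * xi[j] for e, xi in zip(errors, X)) / m
                for j in range(n)]
        theta = [t - alpha * g for t, g in zip(theta, grad)]  # simultaneous update
    return theta
```

On a toy 1-D problem (x_0 = 1 plus one feature) the fitted boundary ends up between the two label groups.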

Optimization algorithms
Given θ, we have code that can compute J(θ) and ∂J(θ)/∂θ_j (for j = 0, 1, …, n)
Optimization algorithms:
- Gradient descent
- Conjugate gradient
- BFGS
- L-BFGS
(BFGS = Broyden–Fletcher–Goldfarb–Shanno)
Advantages of the latter three: no need to manually pick α; often faster than gradient descent
Disadvantage: more complex
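In practice these advanced optimizers come from a library rather than being written by hand. A sketch using SciPy (assuming NumPy and SciPy are available; the data here is a made-up toy set):

```python
import numpy as np
from scipy.optimize import minimize

def cost_and_grad(theta, X, y):
    """Return J(theta) and its gradient, the pair an optimizer needs."""
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))
    m = len(y)
    J = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
    grad = X.T @ (h - y) / m
    return J, grad

# Toy, non-separable 1-D data with an intercept column x0 = 1.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0],
              [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])

# jac=True tells minimize that cost_and_grad returns (J, gradient),
# so no step size alpha has to be picked by hand.
result = minimize(cost_and_grad, x0=np.zeros(2), args=(X, y),
                  jac=True, method="BFGS")
theta = result.x
```

L-BFGS is available the same way via method="L-BFGS-B".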

Multiclass classification
Email foldering/tagging: Work, Friends, Family, Hobby
Medical diagnosis: Not ill, Cold, Flu
Weather: Sunny, Cloudy, Rain, Snow

Binary classification vs. multi-class classification
[Plots: two classes in the (x_1, x_2) plane vs. three classes in the (x_1, x_2) plane]

One-vs-all (one-vs-rest)
Train a separate binary classifier for each class:
Class 1: h_θ^(1)(x)
Class 2: h_θ^(2)(x)
Class 3: h_θ^(3)(x)
h_θ^(i)(x) = P(y = i | x; θ), for i = 1, 2, 3

One-vs-all
Train a logistic regression classifier h_θ^(i)(x) for each class i to predict the probability that y = i.
On a new input x, to make a prediction, pick the class i that maximizes h_θ^(i)(x).
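Putting the pieces together, a compact one-vs-all sketch in plain Python (illustrative names; the inner training loop is simple gradient descent):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_binary(X, y, alpha=0.1, iters=5000):
    """Fit one binary logistic classifier by gradient descent."""
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(iters):
        errors = [sigmoid(sum(t * xj for t, xj in zip(theta, xi))) - yi
                  for xi, yi in zip(X, y)]
        theta = [theta[j] - alpha * sum(e * xi[j] for e, xi in zip(errors, X)) / m
                 for j in range(n)]
    return theta

def one_vs_all(X, labels, classes):
    """Train one classifier per class, relabeling y = 1 for that class only."""
    return {c: train_binary(X, [1 if l == c else 0 for l in labels])
            for c in classes}

def predict(models, x):
    """Pick the class whose classifier reports the highest probability."""
    return max(models, key=lambda c: sigmoid(
        sum(t * xj for t, xj in zip(models[c], x))))
```

Each classifier only has to separate its own class from "everything else", which is what makes a linear model per class sufficient here.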