Follow-ups to HMMs, Graphical Models, Semi-supervised Learning (CISC 5800, Professor Daniel Leeds)


Approaches to learning/classification

Example: distribution over the first letter given the word "duck"

    letter 1    P(letter 1 | word="duck")
    "a"         0.001
    "b"         0.001
    "c"         0.001
    "d"

Joint probability over N features

Problem with learning a table over N features: if all features are dependent, the model needs an exponential number of parameters. Naïve Bayes assumes all features are independent, which needs only a linear number of parameters. What if not all features are independent?

Example (partial) table, P(A,J,Z|B), where B – Burglar breaks in, A – Alarm goes off, J – Jill gets a call, Z – Zack gets a call:

    B   A   J   Z   P(A,J,Z|B)
    Y   Y   Y   Y   0.3
    Y   Y   Y   N   0.03
    Y   Y   N   Y
    Y   Y   N   N   0.06
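
The exponential-versus-linear contrast can be made concrete. The following is a minimal sketch (not from the slides) counting parameters for binary features, assuming a binary class variable for the Naïve Bayes case:

    # Parameters needed for a full joint table vs. the Naive Bayes assumption,
    # for N binary features.
    def full_joint_params(n_features):
        # one probability per joint assignment, minus 1 because they sum to 1
        return 2 ** n_features - 1

    def naive_bayes_params(n_features, n_classes=2):
        # class prior: n_classes - 1 values; each binary feature needs one
        # value P(feature=1 | class) per class
        return (n_classes - 1) + n_features * n_classes

    for n in (3, 10, 20):
        print(n, full_joint_params(n), naive_bayes_params(n))
    # n=20: 1,048,575 joint parameters vs. 41 under Naive Bayes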

Bayes nets: conditional independence

[Figure: directed graph over the nodes B, E, A, J, Z]
B – Burglar, E – Earthquake, A – Alarm goes off, J – Jill is called, Z – Zack is called
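
As a sketch of how the net's conditional independences shrink the model, the code below assumes the classic burglar-alarm structure (B and E are parents of A; A is the parent of J and Z), so the joint factorizes as P(B,E,A,J,Z) = P(B) P(E) P(A|B,E) P(J|A) P(Z|A). The probability values are hypothetical, not taken from the slides:

    # Hypothetical CPTs for the assumed burglar-alarm network
    p_B = {True: 0.01, False: 0.99}                 # P(B)
    p_E = {True: 0.02, False: 0.98}                 # P(E)
    p_A = {(True, True): 0.95, (True, False): 0.90, # P(A=True | B, E)
           (False, True): 0.30, (False, False): 0.01}
    p_J = {True: 0.90, False: 0.05}                 # P(J=True | A)
    p_Z = {True: 0.70, False: 0.01}                 # P(Z=True | A)

    def bern(p_true, value):
        # probability that a binary variable takes 'value' given P(True)
        return p_true if value else 1.0 - p_true

    def joint(b, e, a, j, z):
        # product of the five local factors in the factorization above
        return (p_B[b] * p_E[e]
                * bern(p_A[(b, e)], a)
                * bern(p_J[a], j)
                * bern(p_Z[a], z))

    print(joint(True, False, True, True, False))    # one full assignment

Only 10 numbers are needed here, instead of the 31 a full joint table over five binary variables would require.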

Example evaluation of Bayes nets

[Figure: the same B, E, A, J, Z network as above]
B – Burglar, E – Earthquake, A – Alarm goes off, J – Jill is called, Z – Zack is called

Conditional independence

[Figure: the same B, E, A, J, Z network as above]
B – Burglar, E – Earthquake, A – Alarm goes off, J – Jill is called, Z – Zack is called

HMM: a kind-of example of Bayes nets

[Figure: chain q1 → q2 → q3, with observation oT attached to each state qT]
P(q1, q2, q3, o1, o2, o3) = P(q1) P(o1|q1) P(q2|q1) P(o2|q2) P(q3|q2) P(o3|q3)
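
A minimal sketch of this factorization for a two-state HMM over three time steps; the initial, transition, and emission probabilities below are made up, not from the slides:

    pi = {'s1': 0.6, 's2': 0.4}                        # P(q1)
    trans = {('s1', 's1'): 0.7, ('s1', 's2'): 0.3,     # P(q_t | q_{t-1})
             ('s2', 's1'): 0.4, ('s2', 's2'): 0.6}
    emit = {('s1', 'x'): 0.5, ('s1', 'y'): 0.5,        # P(o_t | q_t)
            ('s2', 'x'): 0.1, ('s2', 'y'): 0.9}

    def hmm_joint(states, obs):
        # multiply the factors in order: initial, then transition + emission
        p = pi[states[0]] * emit[(states[0], obs[0])]
        for t in range(1, len(states)):
            p *= trans[(states[t - 1], states[t])] * emit[(states[t], obs[t])]
        return p

    print(hmm_joint(['s1', 's1', 's2'], ['x', 'y', 'y']))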

Back to Expectation-Maximization

Types of learning

Supervised: each training data point has known features and a class label (most examples so far).
Unsupervised: each training data point has known features but no class label (e.g., ICA, where each component is meant to describe a subset of the data points).
Semi-supervised: each training data point has known features, but only some have class labels (related to expectation-maximization).

Document classification example

Two classes: {farm, zoo}
5 labeled zoo articles, 5 labeled farm articles
100 unlabeled training articles
Features: [% bat, % elephant, % monkey, % snake, % lion, % penguin]
E.g., %bat for article i = #{words in article i equal to "bat"} / #{total words in article i}
Classifier: logistic regression
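
A minimal sketch of computing these percentage features for one article; the animal word list matches the slide, but the example text is made up:

    ANIMALS = ['bat', 'elephant', 'monkey', 'snake', 'lion', 'penguin']

    def animal_features(article_text):
        # fraction of the article's words equal to each animal word
        words = article_text.lower().split()
        total = len(words)
        return [words.count(animal) / total for animal in ANIMALS]

    print(animal_features("The zoo penguin watched the lion and the other lion"))
    # -> [0.0, 0.0, 0.0, 0.0, 0.2, 0.1]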

Iterative learning

1. Learn w with the labeled training data.
2. Use the classifier to assign labels to the originally unlabeled training data.
3. Learn w with the known and newly-assigned labels.
4. Use the classifier to re-assign labels to the originally unlabeled training data.
Repeating steps 3 and 4 converges to a stable answer. (A code sketch of this loop follows below.)
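
A minimal sketch of this self-training loop using scikit-learn's LogisticRegression; X_labeled, y_labeled, and X_unlabeled are assumed to be feature arrays such as the animal percentages above:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def self_train(X_labeled, y_labeled, X_unlabeled, n_rounds=10):
        clf = LogisticRegression()
        clf.fit(X_labeled, y_labeled)              # step 1: labeled data only
        for _ in range(n_rounds):
            pseudo = clf.predict(X_unlabeled)      # steps 2/4: assign labels
            X_all = np.vstack([X_labeled, X_unlabeled])
            y_all = np.concatenate([y_labeled, pseudo])
            clf.fit(X_all, y_all)                  # step 3: relearn w
            if np.array_equal(pseudo, clf.predict(X_unlabeled)):
                break                              # labels stable: stop early
        return clf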