Rutgers CS440, Fall 2003 Review session

Rutgers CS440, Fall 2003 Topics
Final will cover the following topics (after midterm):
1. Uncertainty & introduction to probability
2. Bayesian networks
3. Hidden Markov models & Kalman filters
4. Dynamic Bayesian networks
5. Decision making under uncertainty (static)
6. Markov decision processes
7. Decision trees
8. Statistical learning in BNs
9. Learning with incomplete data (EM)
10. Neural networks
11. (Support vector machines)
12. (Reinforcement learning)

Rutgers CS440, Fall 2003 Uncertainty & probability
Random variables (discrete & continuous)
Joint, marginal, prior, conditional probabilities
Bayes' rule
Independence & conditional independence
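These are definitions rather than algorithms, but Bayes' rule is easy to sanity-check numerically. A minimal sketch; the prior and likelihood values below are made up for illustration and are not from the course:

```python
# Bayes' rule: P(H | e) = P(e | H) * P(H) / P(e),
# with P(e) obtained by marginalizing over H.
p_h = 0.01            # prior P(H = true)   (illustrative value)
p_e_given_h = 0.95    # likelihood P(e | H = true)
p_e_given_not_h = 0.10

p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)   # marginal P(e)
p_h_given_e = p_e_given_h * p_h / p_e                   # posterior

print(f"P(H | e) = {p_h_given_e:.3f}")   # ~0.088
```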

Rutgers CS440, Fall 2003 Bayesian networks
Representation of joint probability distributions & densities
Dependence / independence
Markov blanket
Bayes ball rules
Inference in BNs
–Enumeration
–Variable elimination
–Sampling (simulation)
–Rejection sampling
–Likelihood weighting
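As a reminder of how the approximate-inference methods work, here is a minimal sketch of likelihood weighting on a made-up two-node network Rain -> WetGrass; the CPT numbers are illustrative, not from the lectures:

```python
import random

P_RAIN = 0.2
P_WET_GIVEN = {True: 0.9, False: 0.2}   # P(WetGrass = true | Rain)

def likelihood_weighting(n_samples=100_000):
    """Estimate P(Rain = true | WetGrass = true) by likelihood weighting."""
    weighted_true = total_weight = 0.0
    for _ in range(n_samples):
        rain = random.random() < P_RAIN   # sample non-evidence variable from its prior
        weight = P_WET_GIVEN[rain]        # weight by likelihood of the evidence
        total_weight += weight
        if rain:
            weighted_true += weight
    return weighted_true / total_weight

print(likelihood_weighting())   # exact answer is 0.18 / 0.34, about 0.529
```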

Rutgers CS440, Fall 2003 Hidden Markov models & Kalman filters
Hidden Markov models
–Representation
–Inference (forward, backward, Viterbi)
Kalman filters
–Representation
–Inference (forward)
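A compact sketch of the Viterbi recursion. The weather HMM at the bottom is a standard textbook-style example invented here just to exercise the function; the states and probability tables are not from the course:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden state sequence for an observation sequence."""
    # V[t][s] = probability of the best path ending in state s at time t
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prev, p = max(((r, V[t - 1][r] * trans_p[r][s]) for r in states),
                          key=lambda x: x[1])
            V[t][s] = p * emit_p[s][obs[t]]
            back[t][s] = prev
    # Trace back from the best final state
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.insert(0, back[t][path[0]])
    return path

states = ("Rainy", "Sunny")
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}
print(viterbi(["walk", "shop", "clean"], states, start_p, trans_p, emit_p))
```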

Rutgers CS440, Fall 2003 Dynamic Bayesian networks
Representation
Reduction to HMMs (discrete cases)
Particle filtering
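A minimal bootstrap particle filter sketch on a made-up one-dimensional random-walk model with Gaussian observation noise; the propagate / weight / resample loop is the part that carries over to any DBN:

```python
import random, math

N = 1000
PROCESS_STD, OBS_STD = 1.0, 2.0

def obs_likelihood(z, x):
    # Unnormalized Gaussian likelihood of observation z given state x
    return math.exp(-0.5 * ((z - x) / OBS_STD) ** 2)

particles = [0.0] * N                      # all particles start at the known initial state
for z in [1.2, 2.1, 2.9, 4.2]:             # made-up observation sequence
    # 1. Propagate each particle through the transition model
    particles = [x + random.gauss(0.0, PROCESS_STD) for x in particles]
    # 2. Weight particles by the likelihood of the new observation
    weights = [obs_likelihood(z, x) for x in particles]
    # 3. Resample in proportion to the weights
    particles = random.choices(particles, weights=weights, k=N)
    print(f"observation {z:4.1f} -> state estimate {sum(particles) / N:5.2f}")
```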

Rutgers CS440, Fall 2003 Decision making under uncertainty (static)
Preferences & utility
Utility of money
Maximum expected utility principle
Decision graphs
Value of perfect information
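A small worked example of the MEU principle and the value of perfect information, on an invented two-action, two-state decision problem (all numbers are illustrative):

```python
p_state = {"s1": 0.7, "s2": 0.3}
utility = {("a1", "s1"): 100, ("a1", "s2"): 0,
           ("a2", "s1"): 50,  ("a2", "s2"): 50}
actions = ["a1", "a2"]

def expected_utility(a):
    return sum(p_state[s] * utility[(a, s)] for s in p_state)

meu = max(expected_utility(a) for a in actions)          # act now: max(70, 50) = 70

# With perfect information about the state, pick the best action per state
# and average over how likely each state is.
meu_with_info = sum(p_state[s] * max(utility[(a, s)] for a in actions)
                    for s in p_state)                    # 0.7*100 + 0.3*50 = 85

print("MEU =", meu, " VPI =", meu_with_info - meu)       # VPI = 15
```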

Rutgers CS440, Fall 2003 Markov decision processes
Decision making in dynamic situations
Bellman equations
Value iteration
Policy iteration
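A minimal value-iteration sketch on an invented two-state, two-action MDP; the Bellman backup is the key line:

```python
# P[s][a] is a list of (probability, next_state, reward) triples.
GAMMA, EPS = 0.9, 1e-6
STATES, ACTIONS = ["s0", "s1"], ["stay", "go"]
P = {
    "s0": {"stay": [(1.0, "s0", 0.0)],
           "go":   [(0.8, "s1", 5.0), (0.2, "s0", 0.0)]},
    "s1": {"stay": [(1.0, "s1", 1.0)],
           "go":   [(1.0, "s0", 0.0)]},
}

def q_value(V, s, a):
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a])

V = {s: 0.0 for s in STATES}
while True:
    V_new = {s: max(q_value(V, s, a) for a in ACTIONS) for s in STATES}   # Bellman backup
    if max(abs(V_new[s] - V[s]) for s in STATES) < EPS:
        break
    V = V_new

policy = {s: max(ACTIONS, key=lambda a: q_value(V, s, a)) for s in STATES}
print(V, policy)
```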

Rutgers CS440, Fall 2003 Decision trees
Inductive learning
–Test & training set
Ockham's razor
Decision trees
–Representation
–Learning
Attribute selection
–Entropy
–Information gain
Realizable, non-realizable, redundant
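A short sketch of entropy and information gain for attribute selection, on a made-up toy dataset (the attribute names and rows are illustrative only):

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr, label="label"):
    """Gain(attr) = H(labels) - sum_v P(attr=v) * H(labels | attr=v)."""
    base = entropy([r[label] for r in rows])
    remainder = 0.0
    for v in {r[attr] for r in rows}:
        subset = [r[label] for r in rows if r[attr] == v]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

data = [
    {"outlook": "sunny",  "windy": True,  "label": "no"},
    {"outlook": "sunny",  "windy": False, "label": "no"},
    {"outlook": "rain",   "windy": True,  "label": "no"},
    {"outlook": "rain",   "windy": False, "label": "yes"},
    {"outlook": "cloudy", "windy": False, "label": "yes"},
    {"outlook": "cloudy", "windy": True,  "label": "yes"},
]
for a in ("outlook", "windy"):
    print(a, round(information_gain(data, a), 3))   # "outlook" has the larger gain
```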

Rutgers CS440, Fall 2003 Statistical learning
Optimal prediction: Bayesian prediction
Maximum likelihood (ML) and maximum a posteriori (MAP) learning
ML learning of Bayesian network parameters for complete datasets
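For complete data, ML learning of Bayesian-network parameters reduces to counting. A minimal sketch on an invented Rain -> WetGrass dataset:

```python
from collections import Counter

# ML estimate: theta_{x | parents} = N(x, parents) / N(parents).
data = [
    {"Rain": True,  "Wet": True},
    {"Rain": True,  "Wet": True},
    {"Rain": True,  "Wet": False},
    {"Rain": False, "Wet": False},
    {"Rain": False, "Wet": True},
    {"Rain": False, "Wet": False},
]

def ml_conditional(data, child, parent):
    """ML estimate of P(child = True | each value of parent)."""
    counts, totals = Counter(), Counter()
    for row in data:
        totals[row[parent]] += 1
        if row[child]:
            counts[row[parent]] += 1
    return {pv: counts[pv] / totals[pv] for pv in totals}

print("P(Rain=True)       =", sum(r["Rain"] for r in data) / len(data))
print("P(Wet=True | Rain) =", ml_conditional(data, "Wet", "Rain"))
```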

Rutgers CS440, Fall 2003 EM & Incomplete data
Incomplete/missing data
Data completion, completed (log-)likelihood
Expectation-maximization (EM) algorithm
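A minimal EM sketch for a two-component 1-D Gaussian mixture, where the unobserved component labels play the role of the missing data; the dataset is generated on the spot purely for illustration:

```python
import math, random

random.seed(0)
data = [random.gauss(0, 1) for _ in range(200)] + [random.gauss(5, 1) for _ in range(200)]

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

pi, mu, var = 0.5, [-1.0, 1.0], [1.0, 1.0]   # initial guesses
for _ in range(50):
    # E-step: responsibility of component 0 for each point (data completion)
    r0 = []
    for x in data:
        p0 = pi * normal_pdf(x, mu[0], var[0])
        p1 = (1 - pi) * normal_pdf(x, mu[1], var[1])
        r0.append(p0 / (p0 + p1))
    # M-step: re-estimate parameters from the expected complete-data statistics
    n0 = sum(r0)
    n1 = len(data) - n0
    pi = n0 / len(data)
    mu = [sum(r * x for r, x in zip(r0, data)) / n0,
          sum((1 - r) * x for r, x in zip(r0, data)) / n1]
    var = [sum(r * (x - mu[0]) ** 2 for r, x in zip(r0, data)) / n0,
           sum((1 - r) * (x - mu[1]) ** 2 for r, x in zip(r0, data)) / n1]

print(pi, mu, var)   # means should come out near 0 and 5
```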

Rutgers CS440, Fall 2003 Neural networks
Artificial neurons – perceptron
Representation & linear separability
Perceptron (gradient) learning
Feed-forward, multilayer and recurrent networks (Hopfield)
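A minimal sketch of the perceptron learning rule, w <- w + alpha * (target - prediction) * x, on a linearly separable toy problem (the AND function); inputs are augmented with a constant 1 so the bias is learned as an ordinary weight:

```python
def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

# (bias input, x1, x2) -> target for the AND function
examples = [((1, 0, 0), 0), ((1, 0, 1), 0), ((1, 1, 0), 0), ((1, 1, 1), 1)]
w, alpha = [0.0, 0.0, 0.0], 0.1

for _ in range(25):                      # a few passes over the data suffice here
    for x, target in examples:
        error = target - predict(w, x)
        w = [wi + alpha * error * xi for wi, xi in zip(w, x)]

print(w, [predict(w, x) for x, _ in examples])   # predictions: [0, 0, 0, 1]
```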

Rutgers CS440, Fall 2003 Sample Problem 1
Imagine you wish to recognize bad "widgets" produced by your factory. You are able to measure two numeric properties of each widget: P1 and P2. The value of each property is discretized to be one of {low (L), normal (N), high (H)}. You randomly grab 5 widgets off of your assembly line and extensively test whether or not they are good, obtaining the following results:

P1  P2  Result
L   N   good
H   L   bad
N   H   good
L   H   bad
N   N   good

Explain how you could use this data and Bayes' rule to determine whether the following new widget is more likely to be a good or a bad one (be sure to show your work and explain any assumptions/simplifications you make):

P1  P2  Result
L   L   ?

Solution: Assuming P1 and P2 are conditionally independent given the result, the best prediction for (L, L) is bad (regardless of whether P1 and P2 have the same or different distributions).
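One way to check the stated answer is to run the counting directly. A minimal naive-Bayes sketch of the Bayes-rule computation under the conditional-independence assumption from the solution, with no smoothing:

```python
# Widgets from the problem: (P1, P2, result)
data = [("L", "N", "good"), ("H", "L", "bad"), ("N", "H", "good"),
        ("L", "H", "bad"), ("N", "N", "good")]

def score(result, p1, p2):
    """Unnormalized P(result) * P(P1=p1 | result) * P(P2=p2 | result)."""
    rows = [d for d in data if d[2] == result]
    prior = len(rows) / len(data)
    p_p1 = sum(1 for d in rows if d[0] == p1) / len(rows)
    p_p2 = sum(1 for d in rows if d[1] == p2) / len(rows)
    return prior * p_p1 * p_p2

for result in ("good", "bad"):
    print(result, score(result, "L", "L"))
# good: 3/5 * 1/3 * 0/3 = 0;  bad: 2/5 * 1/2 * 1/2 = 0.1  ->  predict "bad"
```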

Rutgers CS440, Fall 2003 Sample Problem 2
Assume that User A and User B equally share a computer, and that you wish to write a program that determines which person is currently using the computer. You choose to create a (first-order) Markov model that characterizes each user's typing behavior. You decide to group their keystrokes into three classes (Letter, Digit, Other) and have estimated the transition probabilities, producing one transition graph for each user. Both users always start in the Other state upon logging in.

[Figure: two state-transition graphs over the states Letter, Digit, and Other, one for User A and one for User B; the numeric transition probabilities appear only on the original slide.]

Now imagine that the current user logs on and immediately types the following:

IOU $15

Who is more likely to be the current user? Show and explain your calculations.

Solution: User A.
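The numeric transition probabilities from the slide's graphs are not preserved in this transcript, so the matrices below are placeholders; only the shape of the calculation carries over. With the slide's actual numbers the answer is User A.

```python
# Keystroke classes for "IOU $15": I O U <space> $ 1 5
sequence = ["Letter", "Letter", "Letter", "Other", "Other", "Digit", "Digit"]
START = "Other"   # both users always start in the Other state

# PLACEHOLDER transition probabilities; the real values are read off the
# two graphs on the slide (each row must sum to 1).
trans = {
    "A": {"Other":  {"Letter": 0.6, "Digit": 0.2, "Other": 0.2},
          "Letter": {"Letter": 0.7, "Digit": 0.1, "Other": 0.2},
          "Digit":  {"Letter": 0.3, "Digit": 0.5, "Other": 0.2}},
    "B": {"Other":  {"Letter": 0.2, "Digit": 0.5, "Other": 0.3},
          "Letter": {"Letter": 0.4, "Digit": 0.3, "Other": 0.3},
          "Digit":  {"Letter": 0.2, "Digit": 0.6, "Other": 0.2}},
}

def likelihood(user):
    prob, state = 1.0, START
    for nxt in sequence:
        prob *= trans[user][state][nxt]
        state = nxt
    return prob

for user in ("A", "B"):
    print(user, likelihood(user))
# With equal priors, the more likely typist is the one with the larger
# likelihood; the slide's numbers give User A.
```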

Rutgers CS440, Fall 2003 Sample Problem 3
First-grader Maggie has divided her books into two groups: those she likes and those she doesn't. The five (5) books that Maggie likes contain (only) the following words: animal (5 times), mineral (15 times), vegetable (1 time), see (1 time). The ten (10) books that Maggie does not like contain (only) the following words: animal (5 times), mineral (10 times), vegetable (30 times), spot (1 time).

Using the Naïve Bayes assumption, determine whether it is more probable that Maggie likes the following book than that she dislikes it. Show and explain your work.

see mineral vegetable
(These three words are the entire contents of this new book.)

Solution: Maggie is more likely to like the book (this holds even if one assumes Maggie has no prior preference for liking or disliking books).
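A minimal sketch of the Naïve Bayes computation from the counts above, with no smoothing; it reproduces the stated answer because "see" never occurs in the disliked books:

```python
# Word counts from the problem statement.
likes    = {"animal": 5, "mineral": 15, "vegetable": 1, "see": 1}    # 5 books
dislikes = {"animal": 5, "mineral": 10, "vegetable": 30, "spot": 1}  # 10 books
priors = {"like": 5 / 15, "dislike": 10 / 15}
book = ["see", "mineral", "vegetable"]

def score(counts, prior):
    total = sum(counts.values())
    p = prior
    for word in book:
        p *= counts.get(word, 0) / total   # no smoothing, as in the stated solution
    return p

print("like   :", score(likes, priors["like"]))
print("dislike:", score(dislikes, priors["dislike"]))
# "see" never occurs in the disliked books, so the dislike score is 0 and
# "like" wins; the same ordering holds with equal priors.
```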

Rutgers CS440, Fall 2003 Homework discussion
Which grading method is "better"?
–Full average ("ave")
–Drop lowest score ("drop")
–Extra credit ("extra")