CS 401R: Intro. to Probabilistic Graphical Models
Lecture #6: Useful Distributions; Reasoning with Joint Distributions

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License. Some slides (14 onward) are adapted from slides originally created by Andrew W. Moore of CMU; see the Acknowledgments message below.

Announcements

- Assignment 0.1: due today
- Reading Report #2: due Wednesday
- Assignment 0.2 (mathematical exercises): early deadline Friday; due next Monday

Objectives

- Understand 4 important discrete distributions
- Describe uncertain worlds with joint probability distributions
- Reel with terror at the intractability of reasoning with joint distributions
- Prepare to build models of natural phenomena as Bayes Nets

Parametric Distributions

e.g., Normal Distribution
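The slide showed the density only as a graphic; for reference, the standard univariate normal density it refers to is:

f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)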

Bernoulli Distribution

- 2 possible outcomes
- “What’s the probability of a single binary event x, if a ‘positive’ event has probability p?”
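The slide’s formula did not survive the transcript; the standard Bernoulli pmf it describes is:

P(x \mid p) = p^x (1 - p)^{1 - x}, \qquad x \in \{0, 1\}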

Categorical Distribution

- Extension to m possible outcomes
- “What’s the probability of a single event x (containing a 1 in only one position), if outcomes 1, 2, …, m are specified by p = [p_1, p_2, …, p_m]?”
- Note: the p_i must sum to 1
- Great for language models, where each value corresponds to a word or an n-gram of words (e.g., value ‘1’ corresponds to ‘the’)
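The slide’s two equivalent formulas were lost with the graphics; in standard form, for a one-hot vector x = [x_1, …, x_m], the categorical pmf is

P(\mathbf{x} \mid \mathbf{p}) = \prod_{i=1}^{m} p_i^{x_i}

or, equivalently, writing x = i for the position of the 1,

P(x = i \mid \mathbf{p}) = p_i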

Binomial Distribution

- 2 possible outcomes; N trials
- “What’s the probability, in N independent Bernoulli events, that x of them will come up ‘positive’, if a ‘positive’ event has probability p?”
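The slide’s formula was likewise lost; the standard binomial pmf is

P(x \mid N, p) = \binom{N}{x} p^x (1 - p)^{N - x}, \qquad x \in \{0, 1, \ldots, N\}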

Multinomial Distribution

- Extension to m possible outcomes; N trials
- “What’s the probability, in N independent categorical events, that value 1 will occur x_1 times, …, and that value m will occur x_m times, if the probabilities of the possible values are specified by p = [p_1, p_2, …, p_m]?”
- Note: the p_i must sum to 1
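In standard form (the slide’s own rendering was lost), the multinomial pmf is

P(x_1, \ldots, x_m \mid N, \mathbf{p}) = \frac{N!}{x_1! \cdots x_m!} \prod_{i=1}^{m} p_i^{x_i}, \qquad \sum_{i=1}^{m} x_i = N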

Acknowledgments

Note to other teachers and users of the following slides: Andrew Moore would be delighted if you found this source material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. PowerPoint originals are available. If you make use of a significant portion of these slides in your own lecture, please include this message or a link to the source repository of Andrew’s tutorials. Comments and corrections gratefully received.

Why Bayes Nets Matter

- Andrew Moore (Google, formerly CMU): one of the most important technologies to have emerged in the Machine Learning / AI field in the last 20 years
- A clean, clear, manageable language for expressing what you’re certain and uncertain about
- Many practical applications in medicine, factories, helpdesks, robotics, and NLP:
  - Inference: P(diagnosis | these symptoms)
  - Anomaly detection: how anomalous is this observation?
  - Active data collection: which diagnostic test to run next, given current observations

The Joint Distribution

Recipe for making a joint distribution of M variables (running example: Boolean variables X, Y, Z):

1. Make a truth table listing all combinations of values of your variables (if there are M Boolean variables, the table will have 2^M rows).
2. For each combination of values, indicate the probability.

[The original slides filled in an example table with columns X, Y, Z, Prob, plus a diagram of the X=1, Y=1, Z=1 regions; the numeric entries survive only in the slide graphics.]
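A minimal Python sketch of this recipe, assuming Boolean variables; the probability values here are randomly generated placeholders, since the numeric entries from the original slides did not survive the transcript:

from itertools import product
import random

M = 3  # number of Boolean variables, e.g., X, Y, Z

# Step 1: make a truth table listing all combinations of values
# (2**M rows for M Boolean variables).
rows = list(product([0, 1], repeat=M))  # [(0,0,0), (0,0,1), ..., (1,1,1)]

# Step 2: assign a probability to each combination.  Any non-negative
# numbers that sum to 1 define a valid joint distribution; here we draw
# arbitrary weights and normalize them.
random.seed(0)
weights = [random.random() for _ in rows]
total = sum(weights)
joint = {row: w / total for row, w in zip(rows, weights)}

# Sanity check: the probabilities sum to 1.
assert abs(sum(joint.values()) - 1.0) < 1e-12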

Using the Joint Distribution

Once you have the joint distribution, you can ask for the probability of any logical expression E involving any of the “attributes”: sum the probabilities of all rows in which E holds. What is this summation called?
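In symbols (the slide showed this only graphically):

P(E) = \sum_{\text{rows in which } E \text{ holds}} P(\text{row})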

Examples (the numeric answers appeared only in the original slide graphics):

- P(Poor, Male) = sum of the rows where both Poor and Male hold
- P(Poor) = sum of the rows where Poor holds — try this one yourself

Inference with the Joint Dist.

P(Male | Poor) = P(Poor, Male) / P(Poor) — again, the numeric values appeared only in the original slide graphics.
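To make these queries concrete, here is a minimal, self-contained Python sketch over an invented joint distribution on three Boolean variables; the numbers below are made up for illustration and are not the values from the original slides:

joint = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.05,
    (0, 1, 0): 0.20, (0, 1, 1): 0.15,
    (1, 0, 0): 0.05, (1, 0, 1): 0.10,
    (1, 1, 0): 0.25, (1, 1, 1): 0.10,
}
VARS = ("X", "Y", "Z")

def prob(event):
    """P(event): sum the rows of the joint table where `event` holds."""
    return sum(p for row, p in joint.items()
               if event(dict(zip(VARS, row))))

def cond_prob(event, given):
    """P(event | given) = P(event AND given) / P(given)."""
    return prob(lambda a: event(a) and given(a)) / prob(given)

# Marginal query, e.g., P(X = 1): sums the four matching rows.
print(prob(lambda a: a["X"] == 1))                              # ~0.5

# Conditional query, e.g., P(Y = 1 | X = 1) = 0.35 / 0.50.
print(cond_prob(lambda a: a["Y"] == 1, lambda a: a["X"] == 1))  # ~0.7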

The News

- Good news: once you have a joint distribution, you can ask important questions about uncertain events.
- Bad news: it is impossible in practice to build the table for more than about ten attributes, because the number of entries grows exponentially — M Boolean attributes require 2^M rows, so 30 attributes already need about a billion entries.

Next

- Address our efficiency problem by making independence assumptions!
- Use the Bayes Net methodology to build joint distributions in manageable chunks