CS 415 – A.I. Slide Set 12

Chapter 5 – Stochastic Methods Heuristics – apply to problems that either don’t have an exact solution, or whose state spaces are prohibitively large Stochastic methodology – also good for these situations  Based on counting the elements of an application domain

Addition and Multiplication Rules Set A |A| – cardinality of A (number of elements)‏  A could be: empty, finite, countably infinite, or uncountably infinite U – Universe (a.k.a. Domain)‏  The set of ALL elements that could be in A A’ – Complement of A: the elements of U not in A Example  U – people in a room  A – males from U  A’ – females in the room

Other Notations Subset, Union, Intersection

Permutations and Combinations Permutation – an arranged sequence of elements of a set (each used only once)‏  Question: how many unique permutations are there of a set of size n?  n * (n-1) * (n-2) * … * 1 = n!  Question: how many ways can we arrange a set of 10 books on a shelf where only 6 books can fit?  nPr = n!/(n-r)!
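The counting above can be checked directly in Python; this is a small sketch using the standard library (`math.perm` requires Python 3.8+):

```python
import math

# Orderings of all n elements of a set: n * (n-1) * ... * 1 = n!
assert math.factorial(4) == 4 * 3 * 2 * 1

# 10 books on a shelf that holds only 6: nPr = n! / (n - r)!
arrangements = math.perm(10, 6)   # 10 * 9 * 8 * 7 * 6 * 5
print(arrangements)               # 151200
```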

Combination – any subset of the elements that can be formed  Question: How many combinations are there of all n elements of a set?  1 combination for n elements  Order DOES NOT MATTER  Question: How many combinations taken r at a time (How many ways can I form a four-person committee from 10 people?)  nCr = n!/(r!(n-r)!)
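The committee question can be sketched the same way (`math.comb` is also Python 3.8+):

```python
import math

# Four-person committee from 10 people: nCr = n! / (r! * (n - r)!)
committees = math.comb(10, 4)
print(committees)  # 210

# Each r-element combination corresponds to r! ordered permutations
assert math.perm(10, 4) == committees * math.factorial(4)
```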

Elements of Probability Theory

Examples What is the probability of rolling a 7 or 11 with two fair dice?  Sample Space Size?  36  Event Size?  8   For 7: 1,6; 2,5; 3,4; 4,3; 5,2; 6,1  For 11: 5,6; 6,5  Probability  8/36 = 2/9  Add the two counts together because the events are mutually exclusive (disjoint), not because they are independent
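Enumerating the sample space confirms the count; a minimal sketch:

```python
from itertools import product

rolls = list(product(range(1, 7), repeat=2))     # sample space: 36 outcomes
event = [r for r in rolls if sum(r) in (7, 11)]  # 6 ways for 7, 2 ways for 11
print(len(rolls), len(event))                    # 36 8
print(len(event) / len(rolls))                   # 8/36 = 2/9 = 0.222...
```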

How many four-of-a-kind hands can be dealt in all possible five-card hands?  Sample Space?  52 cards taken 5 at a time: C(52,5) = 2,598,960  Event Space?  Multiply the combinations of 13 kinds taken 1 at a time * combinations of 4 suits taken 4 at a time * 48  (number of different kinds of cards) * (number of ways to pick all four cards of the same kind) * (number of ways the fifth card can be chosen) = 13 * 1 * 48 = 624  See top of pg 172  Probability ≈ 624/2,598,960 ≈ 0.00024
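The same computation as a quick check:

```python
import math

hands = math.comb(52, 5)                            # 2,598,960 five-card hands
four_kind = math.comb(13, 1) * math.comb(4, 4) * 48 # 13 * 1 * 48 = 624
print(four_kind / hands)                            # ~0.00024
```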

The probability of any event E from the sample space S of equally likely outcomes is: p(E) = |E|/|S| The sum of the probabilities of all possible outcomes is 1 The probability of the complement of an event is p(E’) = 1 - p(E)  The probability of the contradictory or false outcome of an event Luger: Artificial Intelligence, 5th edition. © Pearson Education Limited, 2005


Are Two Events Independent? Random bit strings of length four 2 Events 1. String has an even number of ones 2. Bit string ends with a zero A total of 2^4 = 16 bit strings – 8 strings end with zero, so p = 1/2 – 8 strings have an even number of ones, so p = 1/2 – 4 strings satisfy both, so p = 1/4 = 1/2 * 1/2 These are independent events, since p(A ∩ B) = p(A)p(B)
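The independence check can be verified by brute force over all 16 strings:

```python
from itertools import product

strings = list(product((0, 1), repeat=4))         # 2**4 = 16 bit strings
even_ones = {s for s in strings if sum(s) % 2 == 0}
ends_zero = {s for s in strings if s[-1] == 0}

p_a = len(even_ones) / len(strings)               # 8/16 = 0.5
p_b = len(ends_zero) / len(strings)               # 8/16 = 0.5
p_ab = len(even_ones & ends_zero) / len(strings)  # 4/16 = 0.25
assert p_ab == p_a * p_b                          # p(A ∩ B) = p(A)p(B): independent
```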

All of Probability Theory in a Nutshell

Probabilistic Inference: Example 3 boolean random variables  All either true or false  S – traffic is slowing down  A – there is an accident  C – there is road construction  Given the state traffic data in Table 5.1  Next slide  Note: all possibilities sum to 1  Can use these numbers to calculate  Probability of a traffic slowdown  Probability of construction without a slowdown, etc.

Table 5.1 The joint probability distribution for the traffic slowdown, S, accident, A, and construction, C, variables of this example. Fig 5.1 A Venn diagram representation of the probability distributions of Table 5.1; S is traffic slowdown, A is accident, C is construction.
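Since the numeric entries of Table 5.1 are not reproduced in this transcript, the sketch below uses made-up placeholder values to show the marginalization pattern the slide describes:

```python
# Hypothetical joint distribution over (S, A, C): slowdown, accident,
# construction. The eight values are illustrative placeholders, NOT the
# actual entries of Table 5.1; like any joint distribution, they sum to 1.
joint = {
    (True,  True,  True):  0.01, (True,  True,  False): 0.02,
    (True,  False, True):  0.12, (True,  False, False): 0.01,
    (False, True,  True):  0.01, (False, True,  False): 0.01,
    (False, False, True):  0.03, (False, False, False): 0.79,
}
assert abs(sum(joint.values()) - 1.0) < 1e-9  # all possibilities sum to 1

# Marginal probability of a slowdown: sum the joint over A and C
p_slow = sum(p for (s, a, c), p in joint.items() if s)
# Construction without a slowdown:
p_construction_no_slow = sum(p for (s, a, c), p in joint.items() if c and not s)
print(p_slow, p_construction_no_slow)
```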

Random Variables Individual Probability Computation 1. Combinatorial Methods (Analytical)‏  Ex: probability of rolling a 5 on a 6-sided die 2. Sampling Events (Empirical)‏  For times when it isn’t that simple to analyze  Assumptions  Not all events are equally likely  (Easier if they are)‏  The probability of an event lies between 0 and 1  Probabilities of unions of sets still hold  Use “Random Variables”


Random Variable Example Random variable – Season  Domain – {spring, summer, fall, winter} Discrete Random Variable  p(Season = spring) = 0.75 Boolean Random Variable  (Season = spring) takes only the values true or false

Expectation Expectation – the notion of expected payoff or cost of an outcome

Example A fair roulette wheel  Integers 0–36, equally spaced  Each player places $1 on any number  If the wheel stops on that number, the player wins $35  Else – loses the $1 Reward of win: $35  Probability 1/37 Cost of loss: $1  Probability 36/37 Ex(E) = 35(1/37) + (-1)(36/37) = -1/37 ≈ -$0.027
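The expectation works out as:

```python
# Expected value of a $1 roulette bet: 37 equally likely slots (0-36),
# win $35 with probability 1/37, lose $1 with probability 36/37.
expectation = 35 * (1 / 37) + (-1) * (36 / 37)
print(expectation)  # -1/37, roughly -0.027: about 2.7 cents lost per dollar bet
```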

Conditional Probability 2 kinds of probabilities 1. Prior Probabilities  What’s the probability of getting a 2 or a 3 on a fair die? 2. Conditional (Posterior) Probabilities  If a patient has symptoms X, Y, and Z, then what is the probability that he has the flu?


Fig 5.2 A Venn diagram illustrating the calculation of p(d|s) as a function of p(s|d).

The chain rule for two sets: p(A ∩ B) = p(A|B) p(B) The generalization of the chain rule to multiple sets: p(A1 ∩ A2 ∩ … ∩ An) = p(A1) p(A2|A1) p(A3|A1 ∩ A2) … p(An|A1 ∩ … ∩ An-1) We make an inductive argument to prove the chain rule; consider the nth case: p(A1 ∩ … ∩ An) We apply the two-set rule to the intersection to get: p(A1 ∩ … ∩ An) = p(An|A1 ∩ … ∩ An-1) p(A1 ∩ … ∩ An-1) And then reduce again, considering that: p(A1 ∩ … ∩ An-1) = p(An-1|A1 ∩ … ∩ An-2) p(A1 ∩ … ∩ An-2) until the two-set case is reached – the base case, which we have already demonstrated.
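The chain rule can be checked numerically on any small joint distribution; the values below are made-up placeholders chosen only to sum to 1:

```python
from itertools import product

# A tiny made-up joint distribution over three binary events A1, A2, A3.
outcomes = list(product((0, 1), repeat=3))
values = (0.10, 0.05, 0.20, 0.05, 0.15, 0.10, 0.05, 0.30)
joint = dict(zip(outcomes, values))

def prob(pred):
    """Probability of the set of outcomes satisfying pred."""
    return sum(p for o, p in joint.items() if pred(o))

lhs = prob(lambda o: o == (1, 1, 1))              # p(A1 ∩ A2 ∩ A3)
p_a1 = prob(lambda o: o[0] == 1)                  # p(A1)
p_a1a2 = prob(lambda o: o[0] == 1 and o[1] == 1)  # p(A1 ∩ A2)
# p(A1) * p(A2|A1) * p(A3|A1 ∩ A2)
rhs = p_a1 * (p_a1a2 / p_a1) * (lhs / p_a1a2)
assert abs(lhs - rhs) < 1e-12
```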


You say [t ow m ey t ow] and I say [t ow m aa t ow]… – Ira Gershwin, “Let’s Call the Whole Thing Off” Fig 5.3 A probabilistic finite state acceptor for the pronunciation of “tomato”, adapted from Jurafsky and Martin (2000).

Phoneme Recognition Problem Use the “tomato”-style stochastic finite state acceptor  Interpret ambiguous collections of phonemes  See how well the phonemes match the path through the state machine for this and other words A phoneme is the smallest structural unit that distinguishes meaning. Phonemes are not the physical segments themselves but, in theoretical terms, cognitive abstractions or categorizations of them.  Ex: the /t/ sound in the words tip, stand, water, and cat

Phoneme Recognition Problem Suppose an algorithm has identified the phone [ni]  Occurs just after other recognized speech, the word “I” Need to associate the phone with either a word or the first part of a word How?  Brown corpus  1 million word collection of sentences from 500 texts  Switchboard corpus  1.4 million word collection of phone conversations  Together: ~2.5 million words that let us sample written and spoken language

How to proceed? Can determine which word containing the phone is used most frequently  See Table 5.2  Most likely: “the”

Table 5.2 The [ni] words with their frequencies and probabilities from the Brown and Switchboard corpora of 2.5M words, adapted from Jurafsky and Martin (2000).

Use Bayes’ theorem  p(word | [ni]) = p([ni] | word) x p(word) / p([ni]); since p([ni]) is the same for every candidate word, rank words by p([ni] | word) x p(word)  See Table 5.3  Most likely: “new”  But “I new” doesn’t make sense  “I need” does
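The ranking step can be sketched as follows; the likelihoods and priors below are made-up placeholders, not the actual entries of Tables 5.2/5.3:

```python
# Rank candidate words for the observed phone [ni] by p([ni]|word) * p(word).
# All numbers are illustrative placeholders.
candidates = {
    # word: (p([ni] | word), p(word))
    "new":  (0.36, 0.001),
    "neat": (0.52, 0.00031),
    "need": (0.11, 0.00056),
    "knee": (1.00, 0.000024),
}
scores = {w: lik * prior for w, (lik, prior) in candidates.items()}
best = max(scores, key=scores.get)
print(best)  # with these placeholder numbers, "new" wins despite lower likelihood
```

Note that a high likelihood p([ni]|word), as for "knee", can still lose to a word with a much larger prior p(word).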

Table 5.3 The [ni] phone/word probabilities from the Brown and Switchboard corpora (Jurafsky and Martin, 2000).

Bayes’ Theorem Review: one disease and one symptom  Individual hypotheses, h_i  Each is disjoint  Set of hypotheses, H  Set of evidence, E p(h_i|E) = p(E|h_i) x p(h_i) / p(E)‏ Can use this to determine which hypothesis is strongest given E  Drop the denominator (it is the same for every h_i)  Arg max (maximum likelihood) hypothesis

Finding p(E)‏ Given: the entire space is partitioned by the set of hypotheses h_i  Partition of a set = split of the set into disjoint subsets that cover it p(E) = Σ_i p(E|h_i) p(h_i)‏
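The total-probability sum and the resulting posteriors can be sketched with made-up numbers; the hypothesis names and all probabilities below are illustrative placeholders:

```python
# Hypothetical partition of hypotheses with made-up priors p(h_i)
# and likelihoods p(E|h_i).
priors = {"flu": 0.1, "cold": 0.3, "healthy": 0.6}       # sum to 1
likelihoods = {"flu": 0.9, "cold": 0.4, "healthy": 0.05}

# Total probability of the evidence: p(E) = sum_i p(E|h_i) * p(h_i)
p_e = sum(likelihoods[h] * priors[h] for h in priors)

# Posteriors via Bayes: p(h_i|E) = p(E|h_i) p(h_i) / p(E); they sum to 1
posteriors = {h: likelihoods[h] * priors[h] / p_e for h in priors}
assert abs(sum(posteriors.values()) - 1.0) < 1e-9
print(p_e, posteriors)
```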

Bayes’ Theorem: General Form The general form of Bayes’ theorem, where we assume the set of hypotheses H partitions the evidence set E: p(h_i|E) = p(E|h_i) p(h_i) / Σ_k p(E|h_k) p(h_k)

Example Suppose you want to buy a car  Probability of going to dealer 1: d1  Probability of purchasing a car at dealer 1: a1 Necessary for using Bayes’ theorem  All probabilities p(h_i) for the various h_i must be known  All relationships between evidence and hypotheses, p(E|h_i), must be known

The application of Bayes’ rule to the car purchase problem: p(d1|a1) = p(a1|d1) p(d1) / Σ_i p(a1|d_i) p(d_i)

Naïve Bayes, or the Bayes classifier, uses the partition (conditional independence) assumption, even when it is not justified: choose the h_i that maximizes p(h_i) x Π_j p(e_j|h_i)
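A minimal naïve Bayes sketch, assuming conditional independence of the evidence items given the class; the class names and all probabilities are illustrative placeholders:

```python
from math import prod

# Naive Bayes: p(c|E) is proportional to p(c) * prod_i p(e_i|c),
# treating each evidence item as independent given the class.
priors = {"spam": 0.4, "ham": 0.6}
cond = {  # p(word|class), made-up values
    "spam": {"offer": 0.30, "meeting": 0.02},
    "ham":  {"offer": 0.05, "meeting": 0.20},
}
evidence = ["offer", "meeting"]

scores = {c: priors[c] * prod(cond[c][e] for e in evidence) for c in priors}
best = max(scores, key=scores.get)
print(best)  # ham: 0.6*0.05*0.20 = 0.006 beats 0.4*0.30*0.02 = 0.0024
```

The denominator p(E) is dropped here for the same reason as in the argmax slide: it is constant across classes.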