Natural Language Processing Giuseppe Attardi Introduction to Probability IP notice: some slides from: Dan Jurafsky, Jim Martin, Sandiway Fong, Dan Klein.


Outline
Probability
- Basic probability
- Conditional probability

1. Introduction to Probability
Experiment (trial)
- Repeatable procedure with well-defined possible outcomes
Sample Space (S)
- The set of all possible outcomes (finite or infinite)
- Example: coin toss experiment; possible outcomes: S = {heads, tails}
- Example: die toss experiment; possible outcomes: S = {1, 2, 3, 4, 5, 6}
Slides from Sandiway Fong

Introduction to Probability
The definition of the sample space depends on what we are asking
- Sample Space (S): the set of all possible outcomes
- Example: die toss experiment where we ask whether the number is even or odd; possible outcomes: {even, odd}, not {1, 2, 3, 4, 5, 6}

More definitions
Events
- An event is any subset of outcomes from the sample space
Example
- Die toss experiment
- Let A represent the event that the outcome of the die toss is divisible by 3
- A = {3, 6}; A is a subset of the sample space S = {1, 2, 3, 4, 5, 6}
Example
- Draw a card from a deck; suppose the sample space is S = {heart, spade, club, diamond} (four suits)
- Let A represent the event of drawing a heart; let B represent the event of drawing a red card
- A = {heart}
- B = {heart, diamond}

Introduction to Probability
Some definitions
- Counting: suppose operation oi can be performed in ni ways; then a sequence of k operations o1 o2 ... ok can be performed in n1 × n2 × ... × nk ways
- Example: die toss experiment, 6 possible outcomes; if two dice are thrown at the same time, the number of sample points in the sample space is 6 × 6 = 36
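As a quick illustration (not from the original slides), here is a minimal Python sketch that enumerates the two-dice sample space and confirms the counting rule:

```python
from itertools import product

# Counting rule: a sequence of k operations with n1, n2, ..., nk choices
# can be performed in n1 * n2 * ... * nk ways.
die = range(1, 7)                       # one die: 6 possible outcomes
sample_space = list(product(die, die))  # two dice thrown together

print(len(sample_space))                # 36 = 6 * 6
```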

Definition of Probability
The probability law assigns to an event A a nonnegative number
- Called P(A), also called the probability of A
- It encodes our knowledge or belief about the collective likelihood of all the elements of A
- The probability law must satisfy certain properties (axioms)

Probability Axioms
Nonnegativity
- P(A) ≥ 0, for every event A
Additivity
- If A and B are two disjoint events (A ∩ B = ∅), then the probability of their union (either one or the other occurs) satisfies: P(A ∪ B) = P(A) + P(B)
Monotonicity
- P(A) ≤ P(B) for any A ⊆ B
Normalization
- The probability of the entire sample space S is equal to 1, i.e. P(S) = 1

An example
- An experiment involving a single coin toss
- There are two possible outcomes, H and T
- Sample space S is {H, T}
- If the coin is fair, we should assign equal probabilities to the 2 outcomes, since they have to sum to 1
- P({H}) = 0.5, P({T}) = 0.5, P({H,T}) = P({H}) + P({T}) = 1.0

Another example
- Experiment involving 3 coin tosses
- Outcome is a 3-long string of H or T: S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
- Assume each outcome is equiprobable ("uniform distribution")
- What is the probability of the event that exactly 2 heads occur?
- A = {HHT, HTH, THH}
- P(A) = P({HHT}) + P({HTH}) + P({THH}) = 1/8 + 1/8 + 1/8 = 3/8
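A minimal Python sketch (added here for illustration) that enumerates the eight outcomes and checks that P(exactly 2 heads) = 3/8:

```python
from itertools import product

# Enumerate the 8 equally likely outcomes of three coin tosses
sample_space = [''.join(t) for t in product('HT', repeat=3)]

# Event A: exactly two heads occur
A = [o for o in sample_space if o.count('H') == 2]

print(A)                           # ['HHT', 'HTH', 'THH']
print(len(A) / len(sample_space))  # 0.375 = 3/8
```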

Probability definitions
In summary, when outcomes are equally likely:
- P(A) = number of outcomes in A / number of outcomes in the sample space
- Probability of drawing a spade from 52 well-shuffled playing cards: 13/52 = 1/4 = 0.25

Probabilities of two events
If two events A and B are independent
- i.e. P(B) is the same whether or not A occurred
Then
- P(A and B) = P(A) · P(B)
Flip a fair coin twice
- What is the probability that both are heads?
Draw a card from a deck, put it back, then draw a card from the deck again
- What is the probability that both drawn cards are hearts?
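A small Python sketch of these two questions, assuming a fair coin and a standard 52-card deck (the answers follow directly from the product rule for independent events):

```python
# Independent events: P(A and B) = P(A) * P(B)

# Two flips of a fair coin: probability both are heads
p_heads = 1 / 2
print(p_heads * p_heads)   # 0.25

# Draw a card, put it back, draw again: probability both are hearts
p_heart = 13 / 52          # 13 hearts in a 52-card deck
print(p_heart * p_heart)   # 0.0625 = 1/16
```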

How about non-uniform probabilities? An example
- A biased coin, twice as likely to come up tails as heads, is tossed twice
- What is the probability that at least one head occurs?
- Sample space = {hh, ht, th, tt} (h = heads, t = tails)
- Sample points and probabilities: hh 1/3 × 1/3 = 1/9, ht 1/3 × 2/3 = 2/9, th 2/3 × 1/3 = 2/9, tt 2/3 × 2/3 = 4/9
- Answer: 1/9 + 2/9 + 2/9 = 5/9 ≈ 0.56 (sum over the outcomes containing at least one head)
- Equivalently: 1 − 4/9 (probability of the complement, tt)
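A short Python sketch (not part of the slides) that reproduces the weighted enumeration above:

```python
from itertools import product

# Biased coin: tails is twice as likely as heads
p = {'h': 1 / 3, 't': 2 / 3}

# Probability of each two-toss outcome = product of per-toss probabilities
outcomes = {a + b: p[a] * p[b] for a, b in product('ht', repeat=2)}
print(outcomes)   # hh: 1/9, ht: 2/9, th: 2/9, tt: 4/9 (printed as floats)

# P(at least one head) = sum over outcomes containing 'h', or 1 - P(tt)
print(sum(v for k, v in outcomes.items() if 'h' in k))   # 0.555... = 5/9
print(1 - outcomes['tt'])                                # same answer
```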

Computing Probabilities
- Direct counts (when outcomes are equally probable)
- Sum for the union of disjoint events: P(A or B) = P(A) + P(B)
- Product for multiple independent events: P(A and B) = P(A) · P(B)
- Indirect probability via the complement: P(A) = 1 − P(S − A)

Moving toward language
- What's the probability of drawing a 2 from a deck of 52 cards with four 2s?
- What's the probability of a random word (from a random dictionary page) being a verb?

Probability and part-of-speech tags
- What's the probability of a random word (from a random dictionary page) being a verb?
- How to compute it: count all the words in the dictionary; the number of ways to get a verb is the number of words which are verbs
- If a dictionary has 50,000 entries, and 10,000 are verbs, then P(V) = 10,000/50,000 = 1/5 = 0.20
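A tiny Python sketch of this relative-frequency estimate, using the slide's illustrative 50,000/10,000 figures (not real dictionary counts):

```python
# Hypothetical dictionary counts, taken from the slide's illustration
total_entries = 50_000
verb_entries = 10_000

# Relative-frequency estimate of P(a random entry is a verb)
p_verb = verb_entries / total_entries
print(p_verb)   # 0.2
```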

Conditional Probability
A way to reason about the outcome of an experiment based on partial information
- In a word guessing game, the first letter of the word is a "t". What is the likelihood that the second letter is an "h"?
- How likely is it that a person has a disease given that a medical test was negative?
- A spot shows up on a radar screen. How likely is it that it corresponds to an aircraft?

More precisely
- Given an experiment, a corresponding sample space S, and a probability law
- Suppose we know that the outcome is within some given event B
- We want to quantify the likelihood that the outcome also belongs to some other given event A
- We need a new probability law that gives us the conditional probability of A given B: P(A|B)

An intuition
- A is "it's raining now"; P(A) in Tuscany is 0.01
- B is "it was raining ten minutes ago"
- P(A|B) means "what is the probability of it raining now if it was raining 10 minutes ago?"
- P(A|B) is probably way higher than P(A); perhaps P(A|B) is 0.10
- Intuition: the knowledge about B should change our estimate of the probability of A

Conditional probability
- One of the following 30 items is chosen at random
- What is P(X), the probability that it is an X?
- What is P(X|red), the probability that it is an X given that it is red?
OXXXOO OXXOXO OOOXOX OOOOXO OXXXXO

Conditional Probability
- Let A and B be events
- P(B|A) = the probability of event B occurring given that event A occurred
- Definition: P(B|A) = P(A ∩ B) / P(A)
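A short Python sketch (added for illustration) applying this definition to a two-dice experiment; the particular events A and B below are example choices, not from the slides:

```python
from itertools import product
from fractions import Fraction

# Sample space: all 36 equally likely outcomes of rolling two dice
S = list(product(range(1, 7), repeat=2))

A = {o for o in S if o[0] + o[1] >= 10}   # event A: the sum is at least 10
B = {o for o in S if o[0] == 6}           # event B: the first die shows 6

def P(E):
    """Probability of event E under the uniform law on S."""
    return Fraction(len(E), len(S))

# Definition: P(B | A) = P(A and B) / P(A)
print(P(A & B) / P(A))   # 1/2
print(P(B))              # 1/6 -- knowing A changes our estimate of B
```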

Conditional Probability
- Note: P(A,B) = P(B|A) · P(A)
- Also: P(A,B) = P(B,A)
- Hence: P(B|A) · P(A) = P(A|B) · P(B)
- Hence: P(B|A) = P(A|B) · P(B) / P(A)

Bayes’ Theorem P(B) : prior probability P(B) : prior probability P(B|A) : posterior probability P(B|A) : posterior probability

Independence
- What is P(A, B) if A and B are independent?
- P(A,B) = P(A) · P(B) iff A, B independent
- P(heads, tails) = P(heads) · P(tails) = 0.5 · 0.5 = 0.25
- Note: P(A|B) = P(A) iff A, B independent
- Also: P(B|A) = P(B) iff A, B independent

Independent Events
Example: |S| = 100, |A| = 25, |B| = 60, |A ∩ B| = 15
- P(A) = P(A|B): 25/100 = 15/60
- P(A ∩ B) = P(A) P(B): 15/100 = 25/100 · 60/100

Monty Hall Problem
- The contestant is shown three doors
- Two of the doors have goats behind them and one has a car
- The contestant chooses a door
- Before opening the chosen door, Monty Hall opens a different door that has a goat behind it
- The contestant can then switch to the other unopened door, or stay with the original choice
- Which is best?

Solution
Consider the sample space of doors: {Car, Goat A, Goat B}. There are three equally likely cases:
1. The contestant chooses the Car door. If she switches, she loses; if she stays, she wins.
2. The contestant chooses door A (goat). If she switches, she wins; otherwise she loses.
3. The contestant chooses door B (goat). If she switches, she wins; otherwise she loses.
Switching gives a 2/3 chance of winning.
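A small Monte Carlo check of this argument (a sketch added here, not part of the slides); it simulates many games and estimates the winning probability with and without switching:

```python
import random

def play(switch, trials=100_000):
    """Estimate the probability of winning the car, switching or staying."""
    wins = 0
    for _ in range(trials):
        doors = ['car', 'goat', 'goat']
        random.shuffle(doors)
        choice = random.randrange(3)
        # Monty opens a door that is neither the contestant's choice nor the car
        opened = next(i for i in range(3) if i != choice and doors[i] != 'car')
        if switch:
            choice = next(i for i in range(3) if i not in (choice, opened))
        wins += doors[choice] == 'car'
    return wins / trials

print(play(switch=True))    # about 0.667
print(play(switch=False))   # about 0.333
```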

Summary
- Probability
- Conditional Probability
- Independence

Additional Material