1 Probability theory LING 570 Fei Xia Week 2: 10/01/07.

2 Misc.
Patas account and dropbox
Course website, “Collect it”, and GoPost
Mailing list
–Did you receive the message on Thursday?
Questions about hw1?

3 Outline
Quiz #1
Unix commands
Linguistics
Elementary probability theory: M&S 2.1

4 Quiz #1
Five areas: weight (average)
Programming: 4.0 (3.74)
–Try Perl or Python
Unix commands: 1.2 (0.99)
Probability: 2.0 (1.09)
Regular expression: 2.0 (1.62)
Linguistics knowledge: 0.8 (0.71)

5 Results (score distribution by range)

6 Unix commands
ls (list), cp (copy), rm (remove)
more, less, cat
cd, mkdir, rmdir, pwd
chmod: to change file permissions
tar, gzip: to tar/zip files
ssh, sftp: to log on or ftp files
man: to learn about a command

7 Unix commands (cont)
compilers and interpreters: javac, gcc, g++, perl, …
ps, top, which
Pipes: cat input_file | eng_tokenizer.sh | make_voc.sh > output_file
sort, uniq, awk, grep
grep "the" voc | awk '{print $2}' | sort | uniq -c | sort -nr

8 Examples
Set the permissions of foo.pl so it is readable and executable by the user and the group:
r-x r-x --- => chmod 550 foo.pl
Move a file, foo.pl, from your home dir to /tmp:
mv ~/foo.pl /tmp

9 Linguistics: POS tags
Open class: noun, verb, adjective, adverb
–Auxiliary verb/modal: can, will, might, …
–Temporal noun: tomorrow
–Adverb: adj+ly, always, still, not, …
Closed class: preposition, conjunction, determiner, pronoun, …
–Conjunction: CC (and), SC (if, although)
–Complementizer: that

10 Linguistics: syntactic structure
Two kinds:
–Phrase structure (a.k.a. parse tree)
–Dependency structure
Example:
–John said that he would call Mary tomorrow

11 Outline
Quiz #1
Unix commands
Linguistics
Elementary probability theory

12 Probability Theory

13 Basic concepts
Sample space, event, event space
Random variable and random vector
Conditional probability, joint probability, marginal probability (prior)

14 Sample space, event, event space
Sample space (Ω): the set of all possible outcomes.
–Ex: toss a coin three times: Ω = {HHH, HHT, HTH, HTT, …}
Event: an event is a subset of Ω.
–Ex: the event “exactly two heads”: {HHT, HTH, THH}
Event space (2^Ω): the set of all possible events.

15 Probability function
A probability function (a.k.a. a probability distribution) distributes a probability mass of 1 throughout the sample space Ω. It is a function P: 2^Ω → [0,1] such that:
–P(Ω) = 1
–For any pairwise disjoint sets A_j ∈ 2^Ω, P(∪_j A_j) = Σ_j P(A_j)
–Ex: P({HHT, HTH, HTT}) = P({HHT}) + P({HTH}) + P({HTT})

16 The coin example
The prob of getting a head is 0.1 for one toss. What is the prob of getting two heads out of three tosses?
P(“getting two heads”) = P({HHT, HTH, THH})
= P(HHT) + P(HTH) + P(THH)
= 0.1*0.1*0.9 + 0.1*0.9*0.1 + 0.9*0.1*0.1
= 3*0.1*0.1*0.9
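The slide’s arithmetic can be checked by enumerating all eight outcomes of three tosses. A minimal Python sketch (the function and variable names are mine, not from the slides):

```python
from itertools import product

p_head = 0.1  # P(head) on one toss, as on the slide

def outcome_prob(outcome):
    """Probability of one outcome string such as 'HHT' (independent tosses)."""
    prob = 1.0
    for toss in outcome:
        prob *= p_head if toss == "H" else (1 - p_head)
    return prob

# All 8 outcomes, and the event "exactly two heads"
outcomes = ["".join(o) for o in product("HT", repeat=3)]
two_heads = [o for o in outcomes if o.count("H") == 2]

p_two_heads = sum(outcome_prob(o) for o in two_heads)
print(sorted(two_heads))       # ['HHT', 'HTH', 'THH']
print(round(p_two_heads, 4))   # 0.027 = 3 * 0.1 * 0.1 * 0.9
```

The per-outcome probabilities sum to 1 over the whole sample space, matching the axioms on the previous slide.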

17 Random variable
The outcome of an experiment need not be a number. We often want to represent outcomes as numbers. A random variable X is a function X: Ω → R.
–Ex: the number of heads in three tosses: X(HHT)=2, X(HTH)=2, X(HTT)=1, …

18 The coin example (cont)
X = the number of heads in three tosses
P(X=2) = P({HHT, HTH, THH}) = P({HHT}) + P({HTH}) + P({THH})

19 Two types of random variables
Discrete: X takes on only a countable number of possible values.
–Ex: toss a coin three times; X is the number of heads that are noted.
Continuous: X takes on an uncountable number of possible values.
–Ex: X is the speed of a car.

20 Common trick #1: Maximum likelihood estimation
An example: toss a coin 3 times and get two heads. What is the probability of getting a head with one toss?
Maximum likelihood (ML): θ* = arg max_θ P(data | θ)
In the example:
–P(X=2) = 3 * p * p * (1-p)
–e.g., the prob is 3/8 when p=1/2, and 12/27 when p=2/3; since 3/8 < 12/27, p=2/3 explains the data better.
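The arg max on this slide can be found analytically (p = 2/3 for two heads in three tosses), but a simple grid search makes the idea concrete. This is only an illustrative stand-in for the arg max, not the method the course prescribes:

```python
def likelihood(p):
    """P(two heads in three tosses | p) = 3 * p^2 * (1 - p)."""
    return 3 * p * p * (1 - p)

# Grid search over candidate values of p as a crude arg max
grid = [i / 1000 for i in range(1001)]
p_star = max(grid, key=likelihood)

print(p_star)                                # 0.667, close to the analytic 2/3
print(likelihood(0.5) < likelihood(2 / 3))   # True: 3/8 < 12/27
```

Setting the derivative 6p - 9p^2 to zero gives the same answer, p = 2/3: the ML estimate is just the observed relative frequency of heads.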

21 Random vector
A random vector is a finite-dimensional vector of random variables: X = [X1, …, Xn].
P(x) = P(x1, x2, …, xn) = P(X1=x1, …, Xn=xn)
–Ex: P(w1, …, wn, t1, …, tn)

22 Notation
X, Y, Xi, Yi are random variables; x, y, xi are values.
P(X=x) is written as P(x).
P(X=x | Y=y) is written as P(x | y).

23 Three types of probability
Joint prob: P(x,y) = prob of X=x and Y=y happening together
Conditional prob: P(x | y) = prob of X=x given a specific value of Y=y
Marginal prob: P(x) = prob of X=x, summed over all possible values of Y

24 An example
There are two coins. Choose a coin and then toss it; do that 10 times.
Coin 1 is chosen 4 times: one head and three tails.
Coin 2 is chosen 6 times: four heads and two tails.
Let’s calculate the probabilities.

25 Probabilities
P(C=1) = 4/10, P(C=2) = 6/10
P(X=h) = 5/10, P(X=t) = 5/10
P(X=h | C=1) = 1/4, P(X=h | C=2) = 4/6
P(X=t | C=1) = 3/4, P(X=t | C=2) = 2/6
P(X=h, C=1) = 1/10, P(X=h, C=2) = 4/10
P(X=t, C=1) = 3/10, P(X=t, C=2) = 2/10

26 Relation between different types of probabilities
Joint = prior * conditional: P(X=h, C=1) = P(C=1) * P(X=h | C=1) = 4/10 * 1/4 = 1/10
Marginal = sum of joints: P(X=h) = P(X=h, C=1) + P(X=h, C=2) = 1/10 + 4/10 = 5/10
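The two identities on this slide can be checked directly from the counts in the two-coin example. A small sketch, with helper names of my own choosing:

```python
# Counts from the two-coin example: (coin, result) -> count
counts = {
    (1, "h"): 1, (1, "t"): 3,   # coin 1: one head, three tails
    (2, "h"): 4, (2, "t"): 2,   # coin 2: four heads, two tails
}
total = sum(counts.values())    # 10 trials

def joint(c, x):
    """P(C=c, X=x), estimated by relative frequency."""
    return counts[(c, x)] / total

def marginal_c(c):
    """P(C=c): sum the joint over all values of X."""
    return sum(joint(c, x) for x in ("h", "t"))

def conditional(x, c):
    """P(X=x | C=c) = P(C=c, X=x) / P(C=c)."""
    return joint(c, x) / marginal_c(c)

# Joint = prior * conditional, as on the slide:
print(joint(1, "h"))                        # 0.1
print(marginal_c(1) * conditional("h", 1))  # 0.1
# Marginal = sum of joints:
print(joint(1, "h") + joint(2, "h"))        # 0.5 = P(X=h)
```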

27 Common trick #2: Chain rule
P(x1, x2, …, xn) = P(x1) P(x2 | x1) P(x3 | x1, x2) ⋯ P(xn | x1, …, xn-1)

28 Common trick #3: joint prob → marginal prob
P(x) = Σ_y P(x, y)

29 Common trick #4: Bayes’ rule
P(A | B) = P(B | A) P(A) / P(B)
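Bayes’ rule lets us flip the conditionals from the two-coin slide: given that a head was observed, how likely is it that coin 1 was chosen? A short sketch using only the numbers already computed on slide 25:

```python
# Numbers from the two-coin example (slide 25)
p_c1 = 4 / 10          # P(C=1)
p_h_given_c1 = 1 / 4   # P(X=h | C=1)
p_h = 5 / 10           # P(X=h), the marginal

# Bayes' rule: P(C=1 | X=h) = P(X=h | C=1) * P(C=1) / P(X=h)
p_c1_given_h = p_h_given_c1 * p_c1 / p_h
print(p_c1_given_h)  # 0.2
```

Observing a head lowers the probability that coin 1 was used (from 0.4 to 0.2), which matches intuition: coin 1 mostly comes up tails.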

30 Independent random variables
Two random variables X and Y are independent iff the value of X has no influence on the value of Y and vice versa:
–P(X,Y) = P(X) P(Y)
–P(Y|X) = P(Y)
–P(X|Y) = P(X)
In our previous example, P(X, C) != P(X) P(C): e.g., P(X=h, C=1) = 1/10, but P(X=h) P(C=1) = 5/10 * 4/10 = 2/10.

31 Conditional independence
Once we know C, the value of A does not affect the value of B and vice versa:
–P(A,B | C) = P(A|C) P(B|C)
–P(A|B,C) = P(A|C)
–P(B|A,C) = P(B|C)

32 Independence and conditional independence
If A and B are independent, are they conditionally independent?
Example:
–Burglar, Earthquake
–Alarm
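The Burglar/Earthquake/Alarm example suggests the answer is no. A minimal numeric version of the same phenomenon, of my own construction rather than from the slides, uses two fair coin flips A and B and C = A XOR B: A and B are independent, but once C is known, A completely determines B.

```python
from itertools import product

# A and B: independent fair coin flips; C = A XOR B
events = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)]
p = {e: 0.25 for e in events}  # each (a, b) pair is equally likely

def prob(pred):
    """Total probability of all (a, b, c) triples satisfying pred."""
    return sum(q for e, q in p.items() if pred(e))

# Unconditional independence: P(A=1, B=1) == P(A=1) * P(B=1)
print(prob(lambda e: e[0] == 1 and e[1] == 1))                # 0.25
print(prob(lambda e: e[0] == 1) * prob(lambda e: e[1] == 1))  # 0.25

# But given C=0: P(A=1, B=1 | C=0) != P(A=1 | C=0) * P(B=1 | C=0)
p_c0 = prob(lambda e: e[2] == 0)
print(prob(lambda e: e == (1, 1, 0)) / p_c0)                       # 0.5
print((prob(lambda e: e[0] == 1 and e[2] == 0) / p_c0) ** 2)       # 0.25
```

So independence does not imply conditional independence; in the slide’s example, learning the alarm went off makes Burglar and Earthquake compete as explanations.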

33 Common trick #5: Independence assumption

34 An example
P(w1 w2 … wn) = P(w1) P(w2 | w1) P(w3 | w1 w2) ⋯ P(wn | w1, …, wn-1)
≈ P(w1) P(w2 | w1) ⋯ P(wn | wn-1)
Why do we make independence assumptions that we know are not true?
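The approximation above (each word depends only on the previous word) is the bigram model. A toy sketch over a hypothetical three-sentence corpus, with relative-frequency estimates; the corpus and function names are invented for illustration:

```python
from collections import Counter

# Hypothetical toy corpus, one tokenized sentence per list
corpus = [
    ["the", "dog", "barks"],
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
]

unigrams = Counter(w for sent in corpus for w in sent)
bigrams = Counter((a, b) for sent in corpus for a, b in zip(sent, sent[1:]))
first_words = Counter(sent[0] for sent in corpus)

def sentence_prob(sent):
    """P(w1) * prod_i P(wi | w(i-1)), each estimated by relative frequency."""
    prob = first_words[sent[0]] / len(corpus)
    for a, b in zip(sent, sent[1:]):
        prob *= bigrams[(a, b)] / unigrams[a]
    return prob

print(sentence_prob(["the", "dog", "barks"]))  # 1 * (2/3) * (1/2) = 1/3
```

The full chain rule would need counts for every history w1 … w(i-1), which no corpus is large enough to supply; the (false) independence assumption keeps the number of parameters small enough to estimate.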

35 Summary of elementary probability theory
Basic concepts: sample space, event space, random variable, random vector
Joint / conditional / marginal probability
Independence and conditional independence
Five common tricks:
–Maximum likelihood estimation
–Chain rule
–Calculating marginal probability from joint probability
–Bayes’ rule
–Independence assumption

36 Outline
Quiz #1
Unix commands
Linguistics
Elementary probability theory

37 Next time
J&M Chapter 2
–Formal languages and formal grammars
–Regular expressions
Hw1 is due at 3pm on Wed.