Lecture 1, Part 2: Corpora and Statistical Methods. Albert Gatt

In this part…
We begin with some basic probability theory:
- the concept of an experiment
- three conceptions of probability:
  - the classical interpretation
  - the long-run, relative frequency interpretation
  - the subjective (Bayesian) interpretation
- rules of probability (part 1)

The concept of an experiment and classical probability

Experiments
The simplest conception of an experiment consists in:
- a set of events of interest (possible outcomes):
  - simple: e.g. the probability of getting any one of the six numbers when we throw a die
  - compound: e.g. the probability of getting an even number when we throw a die
- uncertainty about the actual outcome
This is a very simple conception; research experiments are considerably more complex. Probability is primarily about uncertainty of outcomes.

The classic example: flipping a coin
We flip a fair coin (our "experiment").
What are the possible outcomes? Heads (H) or Tails (T); either is equally likely.
What are the chances of getting H? One out of two: P(H) = ½ = 0.5

Another example: a compound outcome
We roll a die. What are the chances of getting an even number?
- There are six possible outcomes from rolling a die, each with a 1 out of 6 chance.
- There are 3 simple outcomes of interest making up the compound event of interest: the even numbers {2, 4, 6}.
- Any of these qualifies as "success" in our experiment, so effectively we can be successful 3 times out of 6.
P(Even) = 3/6 = 0.5
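
A minimal Python sketch of this enumeration, treating the event space and the event as plain sets (illustrative code, not part of the original slides):

```python
from fractions import Fraction

# Event space for a single die throw
omega = {1, 2, 3, 4, 5, 6}

# Compound event of interest: an even number
even = {n for n in omega if n % 2 == 0}   # {2, 4, 6}

# Favourable outcomes over possible outcomes
p_even = Fraction(len(even), len(omega))
print(p_even, float(p_even))              # 1/2 0.5
```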

Yet another example
We write a random number generator, which generates numbers randomly between 0 and 200. Numbers can be decimals.
Valid outcomes: 0, 1.1, 4, …
NB: the set of possible outcomes is infinite and uncountable (continuous).

Some notation…
We use Ω to denote the total set of outcomes, our event space.
- Ω can be infinite! (cf. the random number generator)
- discrete event space: events can be identified individually (throw of a die)
- continuous event space: events fall on a continuum (the number generator)
We view events and outcomes as sets.

Venn diagram representation of the die-throw example
Possible outcomes (Ω): {1, 2, 3, 4, 5, 6}
Outcomes of interest (denoted A): {2, 4, 6}
[Venn diagram: the set A drawn inside the event space Ω]

Probability: classical interpretation
Given n equally likely outcomes, of which m are the events of interest, the probability that one of the m events occurs is m/n.
If we call our set of events of interest A, then:
P(A) = |A| / |Ω| = m / n
(the number of events of interest, A, over the total number of events)
Principle of insufficient reason (Laplace): we should assume that events are equally likely, unless there is good reason to believe they are not.
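
As a sketch, the classical definition translates directly into a small helper function (the name classical_probability is mine, not from the slides):

```python
from fractions import Fraction

def classical_probability(event, event_space):
    """P(A) = |A| / |Omega|, assuming all outcomes in the event space are
    equally likely (Laplace's principle of insufficient reason)."""
    event = set(event) & set(event_space)   # ignore outcomes not in the space
    return Fraction(len(event), len(set(event_space)))

# The die example again: P(Even) = 3/6
print(classical_probability({2, 4, 6}, {1, 2, 3, 4, 5, 6}))  # 1/2
```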

Compound vs. simple events
If A is a compound event, then P(A) is the sum of the probabilities of the simple events making it up:
P(A) = Σ_{a ∈ A} P(a)   (the sum of P(a) over all elements a of A)
Recall that P(Even) = 3/6 = 0.5. In a throw of the die, the simple events are {1, 2, 3, 4, 5, 6}, each with probability 1/6, so:
P(Even) = P(2) + P(4) + P(6) = 3 × 1/6 = 0.5
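
A quick sketch of the same sum rule under the uniform-die assumption (illustrative code, not from the slides):

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
p = {a: Fraction(1, 6) for a in omega}   # each simple event has probability 1/6

even = {2, 4, 6}
p_even = sum(p[a] for a in even)         # P(A) = sum of P(a) for a in A
print(p_even)                            # 1/2
```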

More rules…
Since, for any compound event A, P(A) = Σ_{a ∈ A} P(a), the probability of the set of all events, P(Ω), is:
P(Ω) = Σ_{a ∈ Ω} P(a) = 1
(this is the likelihood of "anything happening", which is always 100% certain)

Yet more rules…
If A is any event, the probability that A does not occur is the probability of the complement of A:
P(¬A) = 1 - P(A)
i.e. the likelihood that anything which is not in A happens.
Impossible events are those which are not in Ω; they have probability 0.
For any event A: 0 ≤ P(A) ≤ 1
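
A one-line check of the complement rule on the die example (again just an illustrative sketch):

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}

p_even = Fraction(len(even), len(omega))
p_not_even = Fraction(len(omega - even), len(omega))   # complement: {1, 3, 5}

assert p_not_even == 1 - p_even                        # P(not A) = 1 - P(A)
print(p_not_even)                                      # 1/2
```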

Probability trees (I)
Here's an even more complicated example: you flip a coin twice. Possible outcomes (order irrelevant):
- 2 heads (HH): only one way to obtain this: both throws give H
- 1 head, 1 tail (HT): two different ways to obtain this: {throw1=H, throw2=T} OR {throw1=T, throw2=H}
- 2 tails (TT): only one way to obtain this: both throws give T
Are they equally likely? No!

Probability trees (II)
Four equally likely outcomes: HH, HT, TH, TT
[Tree diagram: Flip 1 branches to H or T (probability 0.5 each); Flip 2 branches again to H or T (0.5 each), giving the four outcomes HH, HT, TH, TT]

So the answer to our problem
There are actually 4 equally likely outcomes when you flip a coin twice: HH, HT, TH, TT.
What's the probability of getting 2 heads? P(HH) = ¼ = 0.25
What's the probability of getting a head and a tail? P(HT OR TH) = 2/4 = 0.5
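
The same enumeration can be done mechanically. A small sketch, assuming two fair, independent flips (the variable names are mine):

```python
from itertools import product
from fractions import Fraction

# All ordered outcomes of two coin flips
outcomes = [''.join(flips) for flips in product('HT', repeat=2)]
print(outcomes)                                   # ['HH', 'HT', 'TH', 'TT']

n = len(outcomes)
p_two_heads = Fraction(sum(o == 'HH' for o in outcomes), n)
p_one_each  = Fraction(sum(set(o) == {'H', 'T'} for o in outcomes), n)
print(p_two_heads, p_one_each)                    # 1/4 1/2
```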

Probability trees (III)
Probability trees are useful to picture the order in which different possible outcomes occur.
They have an application in machine learning (decision trees):
- each node represents a "decision"
- the edge leading to a node represents the probability of that node, given the previous node

The stability of the relative frequency

Teaser: violations of Laplace's principle
You randomly pick out a word from a corpus containing 1000 words of English text. Are the following equally likely?
- the word will contain the letter e
- the word will contain the letter h
What about:
- the word will be audacity
- the word will be the
In both cases, prior knowledge or "experience" gives good reason for assuming unequal likelihood of outcomes:
- e is the most frequent letter in English orthography
- the is far more frequent than audacity

Unequal likelihoods
When the Laplace principle is violated, how do we estimate probability? We often need to rely on prior experience. Example:
- in a big corpus, count the frequency of e and h
- in a big corpus, count the frequency of audacity vs. the
- use these estimates to predict the probability on a new 1000-word sample

Example continued
Suppose that, in a corpus of 1 million words:
- C(the) = 50,000
- C(audacity) = 2
Based on frequency, we estimate the probability of each outcome of interest as frequency / total:
- P(the) = 50,000 / 1,000,000 = 0.05
- P(audacity) = 2 / 1,000,000 = 0.000002
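
A minimal sketch of this relative-frequency estimate over a token list (the toy corpus and the helper name are placeholders, not part of the slides):

```python
from collections import Counter

# Placeholder corpus: in practice this would be ~1 million tokens of real text
tokens = "the audacity of the plan surprised the committee".split()

counts = Counter(tokens)
total = len(tokens)

def relative_frequency(word):
    """Estimate P(word) as C(word) / N, the word's relative frequency."""
    return counts[word] / total

print(relative_frequency("the"))        # 0.375 in this toy corpus
print(relative_frequency("audacity"))   # 0.125
```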

Long-run frequency interpretation of probability
Given that a certain event of interest occurs m times in n identical situations, its probability is m/n.
This is the core assumption in statistical NLP, where we estimate probabilities based on frequency in corpora.
Stability of relative frequency: we tend to find that if n is large enough, the relative frequency of an event (m/n) is quite stable across samples.
In language, this may not be so straightforward:
- word frequency depends on text genre
- word frequencies tend to "flatten out" the larger your corpus (Zipf)
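
A short simulation illustrating the stability of the relative frequency (a sketch; it assumes a fair coin simulated with Python's random module):

```python
import random

random.seed(0)  # reproducible run

def relative_frequency_of_heads(n_flips):
    """Flip a fair coin n_flips times; return the relative frequency of heads."""
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

for n in (10, 100, 10_000, 1_000_000):
    print(n, relative_frequency_of_heads(n))
# As n grows, the relative frequency settles close to the true probability 0.5.
```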

The addition rule

You flip 2 coins. What's the probability that you get at least one head?
The first intuition: P(H on first coin) + P(H on second coin).
But P(H) = 0.5 in each case, so the total P would be 1. What's wrong?
We're counting the probability of getting two heads twice!
- Possible outcomes: {HH, HT, TH, TT}
- P(H) = 0.5 for the first coin includes the case where our outcome is HH.
- If we also assume P(H) = 0.5 for the second coin, this too includes the case where our outcome is HH.
- So, we count HH twice.

Venn diagram representation
Set A represents outcomes where the first coin = H; set B represents outcomes where the second coin = H. A and B are our outcomes of interest (TT is not in these sets).
[Venn diagram: A = {HH, HT}, B = {HH, TH}, with TT outside both sets]
A and B have a nonempty intersection, i.e. there is an event (HH) which is common to both. Both contain two outcomes, but the number of unique outcomes is not 4, but 3.

Some notation
- A ∪ B = events in A and events in B (the union)
- A ∩ B = events which are in both A and B (the intersection)
- P(A ∪ B) = probability that something which is either in A OR B occurs
- P(A ∩ B) = probability that something which is in both A AND B occurs

Addition rule
To estimate the probability of A OR B happening, we need to remove the probability of A AND B happening, to avoid double-counting events:
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
In our case:
- P(A) = 2/4
- P(B) = 2/4
- P(A ∩ B) = ¼
- P(A ∪ B) = 2/4 + 2/4 - ¼ = ¾ = 0.75
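
A small sketch checking the addition rule on the two-coin example, representing events as Python sets (illustrative code, not from the slides):

```python
from fractions import Fraction

omega = {'HH', 'HT', 'TH', 'TT'}          # all outcomes of two coin flips
A = {o for o in omega if o[0] == 'H'}     # first coin is heads:  {'HH', 'HT'}
B = {o for o in omega if o[1] == 'H'}     # second coin is heads: {'HH', 'TH'}

def P(event):
    return Fraction(len(event), len(omega))

lhs = P(A | B)                            # P(A OR B), computed directly
rhs = P(A) + P(B) - P(A & B)              # addition rule
print(lhs, rhs)                           # 3/4 3/4
assert lhs == rhs
```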

Subjective (Bayesian) probability

Bayesian probability
We won't cover this in huge detail for now (more on Bayes in the next lecture).
Bayes was concerned with events or predictions for which a frequency calculation can't be obtained: subjective rather than objective probability.
(Actually, what Bayes wanted to do was calculate the probability that God exists.)

Example: the past doesn't guarantee the future
The stock market:
- the price of stocks/shares is extremely unpredictable
- we can't usually predict tomorrow's price based on past experience
- too many factors influence it (consumer trust, political climate, …)
What is the probability that your shares will go up tomorrow?

Example (cont'd.)
In the absence of a rigorous way to estimate the probability of something, you need to rely on your beliefs. Hopefully, your beliefs are rational.
Is the chance of your shares going up greater than the chance of:
- pulling out a red card from a deck containing 5 red and 5 black cards? Then P(shares up) > 0.5
- pulling out a red card from a deck containing 9 red cards and one black? Then P(shares up) > 0.9
- etc.

Bayesian subjective probability
An event has a subjective probability m/n of occurring if you view it as equally likely to happen as pulling a red card from a deck of n cards in which m of the cards are red.
Does this sound bizarre? Objective vs. subjective probability is a topic of some controversy…