Probability Theory.


Probability Theory

Elements & Axioms

Probability Space of Two Dice
The probability space for rolling two dice has three elements: a Sample Space (Ω) of all 36 ordered outcomes, a σ-Algebra (ℱ) of events, and a Probability Measure Function (P).
Example event: E5 = {(1,4), (2,3), (3,2), (4,1)}, with P(E5) = 4/36 ≈ 0.11.
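A minimal R sketch of this slide's example (the names omega and E5 are my own, and a uniform measure over the 36 outcomes is assumed):

omega <- expand.grid(die1 = 1:6, die2 = 1:6)   # sample space: all 36 ordered outcomes
E5 <- omega[omega$die1 + omega$die2 == 5, ]    # event E5: the two faces sum to 5
nrow(E5) / nrow(omega)                         # P(E5) = 4/36, roughly 0.11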

Probability Space (Ω, ℱ, P)
Elements:
1. Sample Space (Ω), e.g. { (1,1), (1,2), ... }
2. σ-Algebra (ℱ), e.g. the event E5 = {(1,4), (2,3), (3,2), (4,1)}
3. Probability Measure Function (P), e.g. P(E5) ≈ 0.11
Probability Axioms:
1. Non-negativity
2. Unit Measure (i.e., unitarity)
3. σ-Additivity, e.g.: P(3 dots or 4 dots) = P(3 dots) + P(4 dots) = ⅙ + ⅙ = ⅓

Exercise 4.3) Probability Example
In a pinochle deck there are 48 cards: 6 values (9, 10, Jack, Queen, King, Ace) x 4 suits x 2 copies = 48.
What is the probability of drawing a 10?
What is the probability of drawing { 10 or Jack }? Recall: σ-Additivity.
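One way to check the arithmetic, a short R sketch (the variable names are my own; it assumes 8 copies of each value, i.e. 4 suits x 2 copies):

p_ten  <- 8 / 48     # Prob(drawing a 10) = 1/6
p_jack <- 8 / 48     # Prob(drawing a Jack) = 1/6
p_ten + p_jack       # disjoint events, so sigma-additivity gives Prob(10 or Jack) = 1/3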

Probability Space vs Other Spaces
If you remove the unitarity axiom, a probability space becomes a general measure space. If you also remove the measure function, you are left with a measurable space: just a set and a σ-algebra. In fact, probability space is just a specific resident of the “space of mathematical spaces”. Why don’t we use, e.g., Banach spaces instead?
Probability space (Unit Measure):
Elements: 1. Sample Space (Ω), 2. σ-Algebra (ℱ), 3. Probability Function (P)
Axioms: 1. Non-negativity, 2. Unitarity, 3. σ-Additivity

Plausibility Inference or Frequency Analysis?
[Diagram: two routes to probability theory. The frequentist perspective starts from the requirements for a frequency-analysis system and leads to the Kolmogorov axioms and theorems. The Bayesian perspective starts from the requirements for a plausibility-inference system and leads, via the plausibility axioms and Cox’s Theorem, to the same probability theory.]

Frequentism vs Bayesianism

Externalism: Probability as Frequency

Internalism: Probability as Degree of Belief
As we saw, calibration can improve credibility estimates in the long term. Simulated betting is a way to elicit (materialize) your subjective credibilities.
Proposition X ≝ “A snowstorm will close the highway near Indianapolis on Christmas”
Decision 1. Gamble A: you get $100 if X is true. Gamble B: you get $100 if you draw red from a bag with { 5 red, 5 white } marbles.
Decision 2. Gamble A: you get $100 if X is true. Gamble B: you get $100 if you draw red from a bag with { 1 red, 9 white } marbles.
Suppose that in Decision 1 you prefer Gamble B. This means your subjective P(X) < 0.5. Suppose that in Decision 2 you pick Gamble A. This means your subjective P(X) > 0.1.

Probability Distributions

Recall: Two Kinds of Distribution
Probability Mass Function (PMF): DiceRoll has the discrete domain { 1, 2, 3, 4, 5, 6 }, each outcome with mass 1/6. Unitarity means ∑ Prob(Xe) = 1. It is also true that ∀Xe, Prob(Xe) ≤ 1.
Probability Density Function (PDF): Snowfall (inches) has the continuous domain [0, ∞). Unitarity means ∫ p(x) dx = 1. It is not true that ∀x, p(x) ≤ 1.
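A small R sketch of the contrast (the exponential density is my own stand-in for the snowfall example):

pmf <- rep(1/6, 6)                            # PMF of a fair die
sum(pmf)                                      # unitarity for a PMF: masses sum to 1
all(pmf <= 1)                                 # TRUE: no single mass exceeds 1

p <- function(x) dexp(x, rate = 2)            # a PDF on [0, Inf)
integrate(p, 0, Inf)$value                    # unitarity for a PDF: the integral is 1
p(0)                                          # 2: a density value can exceed 1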

Continuous Bins
[Figure: the same continuous variable histogrammed at two resolutions, Bin Size = 2 in and Bin Size = 1 in.]

Example of p(x) > 1.0
As the bin size becomes infinitesimally narrow, the probability mass Prob(X) in any one bin approaches zero. But the ratio of probability mass to interval width remains meaningful. Let p(x) ≝ Prob(X) / dx.
An analogy: metallic lead has a density of ~11 grams/cm³ even in milligram quantities, because a milligram of lead takes up only about 0.000088 cm³ of space. In the same way, if probability mass is compressed into a very narrow interval, p(x) can exceed 1.0 without violating unitarity.
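A hedged R sketch of this “compressed mass” idea, using a uniform density on a narrow interval of my own choosing:

p <- function(x) dunif(x, min = 0, max = 0.1)   # all mass squeezed into [0, 0.1]
p(0.05)                                         # density = 10, well above 1
integrate(p, 0, 0.1)$value                      # the total mass is still 1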

Unitarity: Discrete vs Continuous
We can algebraically manipulate the discrete unitarity formula and arrive at the continuous unitarity formula.
Start from the binned masses: ∑ᵢ Prob([xᵢ, xᵢ + Δx]) = 1.
Multiplying by Δx / Δx doesn’t change the formula: ∑ᵢ Δx · Prob([xᵢ, xᵢ + Δx]) / Δx = 1.
As Δx → 0, we rename each term: Δx becomes dx, the mass-to-width ratio Prob([x, x + Δx]) / Δx becomes the density p(x), and the sum becomes the integral ∫ dx p(x) = 1.
Note: p(x) ≠ Prob(x). This is how we move from PMF to PDF: ∑ Prob(Xe) = 1 → ∫ dx p(x) = 1.

Density for normal distributions
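A quick R check of the normal density formula against the built-in dnorm() (the parameter values are my own choices):

mu <- 0; sigma <- 1.5; x <- 0.7
by_hand  <- exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))   # normal density written out
built_in <- dnorm(x, mean = mu, sd = sigma)                             # R's built-in density
all.equal(by_hand, built_in)                                            # TRUE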

Descriptive Statistics
Central Tendency: mean, median, mode, etc. Uncertainty: stdev, etc.
Question: what is the relation between μ and E[x]?
The Expectation Operator is E[x] = ∑ x · Prob(x) for a PMF, or ∫ x p(x) dx for a PDF. The Variance is Var(x) = E[(x − E[x])²].
Suppose I asked you to find the value M that minimizes E[(x − M)²]. What is the solution? M = E[x]. In this sense, the mean pairs with the stdev. If we were instead trying to minimize E[|x − M|], the median would minimize the expected distance.
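A small simulation sketch in R of that last claim (the distribution and sample size are arbitrary choices of mine):

set.seed(1)
x <- rexp(1e5, rate = 1)                           # a skewed sample, so mean != median
sq_loss  <- function(M) mean((x - M)^2)            # expected squared deviation
abs_loss <- function(M) mean(abs(x - M))           # expected absolute deviation
optimize(sq_loss,  interval = range(x))$minimum    # lands near mean(x)
optimize(abs_loss, interval = range(x))$minimum    # lands near median(x)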

Exercise 4.4 Example
Let’s run through an example probability density and calculate E[x].
Let p(x) = 6x(1 − x) for x ∈ [0, 1].
Recall, E[x] = ∫ x p(x) dx. Here E[x] = ∫₀¹ 6x²(1 − x) dx = 6(1/3 − 1/4) = 1/2.
Let’s check our work...
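A numerical check in R that p(x) is a proper density and that the expectation comes out to 1/2:

p <- function(x) 6 * x * (1 - x)
integrate(p, 0, 1)$value                       # 1: unitarity holds
integrate(function(x) x * p(x), 0, 1)$value    # 0.5: E[x]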

High Density Interval
Another way to summarize a distribution is to use High Density Intervals (HDIs). We will use HDIs most often.
Unitarity: over all x, ∫ dx p(x) = 1.00.
HDI: the narrowest range(s) of x over which ∫ dx p(x) = 0.95, so every point inside the HDI has higher density than any point outside it.
Example distributions: 1. Normal 2. Skewed 3. Bimodal (where the HDI can be two disjoint ranges).
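A sketch of one common sample-based approximation, the shortest interval covering 95% of sorted draws (the helper name and the gamma example are mine; for a bimodal distribution the true HDI can be disjoint ranges, which this single-interval version does not capture):

hdi_from_samples <- function(draws, mass = 0.95) {
  draws <- sort(draws)
  n_in  <- ceiling(mass * length(draws))                            # how many draws must fall inside
  widths <- draws[n_in:length(draws)] - draws[1:(length(draws) - n_in + 1)]
  lo <- which.min(widths)                                           # narrowest candidate interval
  c(lower = draws[lo], upper = draws[lo + n_in - 1])
}
set.seed(1)
hdi_from_samples(rgamma(1e5, shape = 2, rate = 1))                  # asymmetric, unlike a central interval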

Two-Way Distributions

Joint & Marginal Probabilities
Consider two discrete random variables: hair color and eye color. We can distribute probabilities across multiple variables simultaneously.
Each cell in the resulting table (e.g., Prob(Black Hair, Green Eyes)) is a joint probability. If we collapse a dimension (e.g., take row totals), we get marginal probabilities.
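The slide’s own table is not reproduced here, but R ships with a built-in HairEyeColor dataset of hair-by-eye-by-sex counts, which serves the same purpose; a sketch of building the joint and marginal tables from it:

counts <- apply(HairEyeColor, c(1, 2), sum)   # collapse over Sex, keeping Hair x Eye counts
joint  <- counts / sum(counts)                # joint table: Prob(hair, eye)
rowSums(joint)                                # marginal Prob(hair), collapsing over eye color
colSums(joint)                                # marginal Prob(eye), collapsing over hair color
sum(joint)                                    # unitarity: the whole table sums to 1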

Conditional Probabilities
To condition on blue eyes, you simply filter out the other outcomes. Prob(h|blue) is pronounced “the probability of hair color h given blue eyes”.
Filtering by itself violates unitarity: the remaining probabilities no longer sum to 1. So after you condition on an outcome, renormalize.

Conditional Probabilities: Formal Definition
Conditionals use normalization. Each cell here is: p(h|blue) = p(blue, h) / p(blue).
This normalization process generalizes. Conditional probabilities can be defined as p(a|b) = p(a, b) / p(b), for any outcomes a and b with p(b) > 0.
Next week, we will use this definition to derive Bayes’ Theorem.
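Continuing the HairEyeColor sketch from above (again an assumption standing in for the slide’s table), conditioning is filtering a column and renormalizing:

counts <- apply(HairEyeColor, c(1, 2), sum)
joint  <- counts / sum(counts)
p_blue        <- sum(joint[, "Blue"])         # marginal Prob(blue eyes)
p_hair_g_blue <- joint[, "Blue"] / p_blue     # Prob(hair | blue) = Prob(hair, blue) / Prob(blue)
sum(p_hair_g_blue)                            # 1 again after renormalizing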

Exercise 4.1) Conditional Probabilities in R Let’s run through this scenario in R.