Probability and Statistics

Probability vs. Statistics
In probability, we start from the mathematics of permutations, combinations, and set theory and build up a mathematical theory of how the outcomes of an experiment will be distributed.
In statistics we go in the opposite direction: we start from actual data and measure it to determine what mathematical model it fits.
Statistical measures are estimates of properties of the underlying random variables.

Basic Statistics
Let X be a finite sequence of numbers (data values) $x_1, \dots, x_n$.
E.g. X = 1, 3, 2, 4, 1, 4, 1 (n = 7).
Order doesn’t matter, but we need a way of allowing duplicates (a “multiset”).
Some measures:
Maximum: 4
Minimum: 1
Median (as many values ≥ as ≤): sorted, the data are 1, 1, 1, 2, 3, 4, 4, so the median is 2
Mode (the value with maximum frequency): 1
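
The four measures on this slide can be checked with a few lines of Python. This is only a sketch using the standard library; the variable names are illustrative and not part of the original slides.

```python
from collections import Counter
from statistics import median

# The multiset X from the slide; a list keeps duplicates.
data = [1, 3, 2, 4, 1, 4, 1]

maximum = max(data)
minimum = min(data)
med = median(data)  # middle value of the sorted data

# most_common(1) returns the (value, count) pair with the highest frequency.
mode_value, mode_count = Counter(data).most_common(1)[0]

print(maximum, minimum, med, mode_value)
```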

Sample Mean
Let X be a finite sequence of numbers $x_1, \dots, x_n$.
The sample mean of X is what we usually call the average: $\mu_X = \frac{1}{n} \sum_{i=1}^{n} x_i$
For example, if X = 1, 3, 2, 4, 1, 4, 1, then $\mu_X = 16/7$.
Note that the mean need not be one of the data values.
We might as well write this as E[X], following the notation used for random variables.
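
As a quick check of the example, the average can be computed exactly with Python's `fractions` module. The helper name `sample_mean` is an assumption made for this sketch.

```python
from fractions import Fraction

def sample_mean(values):
    """Return the arithmetic mean of the values as an exact fraction."""
    return Fraction(sum(values), len(values))

x = [1, 3, 2, 4, 1, 4, 1]
mu_x = sample_mean(x)

# The exact value and its decimal approximation.
print(mu_x, float(mu_x))
```

Using an exact fraction makes it easy to see that the mean is not one of the data values.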

Sample Variance
The sample variance of a sequence of data points is the mean of the squares of the differences from the sample mean: $\sigma_X^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu_X)^2$
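
A minimal sketch of this definition follows. It assumes the variance divides by n (the number of data points), which matches the worked examples later in the deck; the function name `sample_variance` is illustrative.

```python
def sample_variance(values):
    """Mean of the squared differences from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

print(sample_variance([1, 2, 3]))
print(sample_variance([1, 2, 3, 4, 5]))
```

In Python's standard library, `statistics.pvariance` also divides by n, while `statistics.variance` divides by n − 1.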

Standard Deviation
Standard deviation is the square root of the variance: $\sigma_X = \sqrt{\sigma_X^2}$
σ is a measure of spread in the same units as the data.

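σ Measures “Spread”
X = 1, 2, 3: $\mu = 2$, $\sigma^2 = \frac{1}{3}\left((1-2)^2 + (2-2)^2 + (3-2)^2\right) = 2/3$, $\sigma \approx 0.82$
Y = 1, 2, 3, 4, 5: $\mu = 3$, $\sigma^2 = \frac{1}{5}\left((1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2\right) = 10/5 = 2$, $\sigma \approx 1.41$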

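Small σ Indicates “Centeredness”
X = 1, 2, 3: $\sigma \approx 0.82$
Z = 1, 2, 2, 2, 3: $\mu = 2$, $\sigma^2 = \frac{1}{5}\left((1-2)^2 + 3(2-2)^2 + (3-2)^2\right) = 2/5$, $\sigma \approx 0.63$
W = 1, 1, 2, 3, 3: $\mu = 2$, $\sigma^2 = \frac{1}{5}\left(2(1-2)^2 + (2-2)^2 + 2(3-2)^2\right) = 4/5$, $\sigma \approx 0.89$ (“bimodal”)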
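
The standard deviations quoted for X, Y, Z, and W can be reproduced with `statistics.pstdev`, which divides by n as these slides do. The dictionary names below are assumptions made for this sketch.

```python
from statistics import pstdev

# The four example data sets from the slides.
datasets = {
    "X": [1, 2, 3],
    "Y": [1, 2, 3, 4, 5],
    "Z": [1, 2, 2, 2, 3],
    "W": [1, 1, 2, 3, 3],
}

# pstdev is the square root of the divide-by-n variance.
spread = {name: pstdev(values) for name, values in datasets.items()}

for name, sigma in spread.items():
    print(f"{name}: sigma = {sigma:.2f}")
```

Z, X, and W all have mean 2, so the ordering of their σ values reflects only how tightly the data cluster around that mean.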

Covariance
Sometimes two quantities tend to vary in the same way, even though neither is exactly a function of the other.
For example, the height and weight of people.
[Figure: scatter plot; axes: Height, Weight]

Covariance for Random Variables
Roll two dice. Let X = the larger of the two values and Y = the sum of the two values.
Mean of X = (1/36) × (1×1 [the only possibility is (1,1)] + 3×2 [(1,2), (2,1), (2,2)] + 5×3 [(1,3), (2,3), (3,3), (3,2), (3,1)] + 7×4 + 9×5 + 11×6) = 161/36 ≈ 4.47
Mean of Y = 7
How do we say that X tends to be large when Y is large and vice versa?
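
Both means can be verified by enumerating the 36 equally likely rolls. This is a sketch assuming fair, independent dice; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (a, b) of rolling two dice.
rolls = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(rolls))

mean_x = sum(p * max(a, b) for a, b in rolls)  # X = larger of the two values
mean_y = sum(p * (a + b) for a, b in rolls)    # Y = sum of the two values

print(mean_x, float(mean_x), mean_y)
```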

Joint Probability
f(x,y) = Pr(X=x and Y=y) is a probability.
It sums to 1 over all possible x and y.
For the two dice, with X = the larger value and Y = the sum:
Pr(X=1 and Y=12) = 0
Pr(X=5 and Y=9) = 2/36 [(4,5), (5,4)]
Pr(X≤5 and Y≥8) = 6/36 [(3,5), (5,3), (4,4), (4,5), (5,4), (5,5)]
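
Counts like these are easy to get wrong by hand, so it can help to build the joint probability table by enumeration. The sketch below assumes two fair dice with X = the larger value and Y = the sum; the helper `pr` is a hypothetical name. For the event X≤5 and Y≥8, the enumeration includes the rolls (3,5) and (5,3) along with (4,4), (4,5), (5,4), and (5,5).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))

# Count how many rolls give each (x, y) pair, then convert to probabilities.
counts = Counter((max(a, b), a + b) for a, b in rolls)
joint = {xy: Fraction(c, len(rolls)) for xy, c in counts.items()}

def pr(event):
    """Probability that the pair (X, Y) satisfies event(x, y)."""
    return sum(prob for (x, y), prob in joint.items() if event(x, y))

print(sum(joint.values()))                  # total probability
print(pr(lambda x, y: x == 1 and y == 12))
print(pr(lambda x, y: x == 5 and y == 9))
print(pr(lambda x, y: x <= 5 and y >= 8))
```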

Covariance of Random Variables
$\mathrm{Cov}(X, Y) = E[(X - \mu_X) \cdot (Y - \mu_Y)]$
INSIDE the brackets, each of X and Y is compared to its own mean.
The OUTER expectation is with respect to the joint probability that X=x AND Y=y.
Positive if X tends to be greater than its mean when Y is greater than its mean.
Negative if X tends to be greater than its mean when Y is less than its mean.
But what are the units? (The units of X multiplied by the units of Y.)
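
Applied to the two-dice example (X = the larger value, Y = the sum), the definition can be evaluated by summing over all 36 rolls. This is a sketch assuming fair dice; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(rolls))  # each roll has joint probability 1/36

mu_x = sum(p * max(a, b) for a, b in rolls)
mu_y = sum(p * (a + b) for a, b in rolls)

# Expectation of (X - mu_x) * (Y - mu_y) over the joint distribution.
cov_xy = sum(p * (max(a, b) - mu_x) * ((a + b) - mu_y) for a, b in rolls)

print(cov_xy, float(cov_xy))
```

The result is positive, which matches the intuition that the larger die value tends to be high when the sum is high.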

Sample Covariance
Suppose we just have the data $x_1, \dots, x_N$ and $y_1, \dots, y_N$, and we want to know the extent to which these two sets of values covary (e.g. height and weight).
The sample covariance is $\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu_X)(y_i - \mu_Y)$
It is an estimate of the covariance.
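
A minimal sketch of the sample covariance follows. It assumes the sum is divided by N, in line with the divide-by-n variance used earlier in the deck. The height and weight numbers are made up purely for illustration, and the function name is hypothetical.

```python
def sample_covariance(xs, ys):
    """Mean of the products of paired deviations from each mean (divides by N)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

# Made-up data for illustration only (not from the slides).
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 58, 69, 80, 93]

print(sample_covariance(heights_cm, weights_kg))
```

The units of the result here are cm·kg, which is what makes raw covariance hard to interpret on its own.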

A Better Measure: Correlation
Correlation is covariance scaled to [-1, 1]: $\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$
This is a unitless number!
If X and Y vary in the same direction, then the correlation is close to +1.
If they vary inversely, then the correlation is close to -1.
If neither tends to be above or below its own mean when the other is, then the correlation is close to 0.
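
The scaling can be sketched as covariance divided by the product of the two standard deviations. This assumes the usual Pearson definition, since the slide's own formula did not survive extraction; the function name and test data are illustrative.

```python
from math import sqrt

def correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

xs = [1, 2, 3, 4]
print(correlation(xs, [2, 4, 6, 8]))    # varies in the same direction
print(correlation(xs, [8, 6, 4, 2]))    # varies inversely
print(correlation(xs, [1, -1, -1, 1]))  # no linear relationship
```

Because the units of the covariance cancel against the units of the two standard deviations, the result does not change if either variable is rescaled by a positive constant.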

Positively Correlated Data

Correlation Examples
http://upload.wikimedia.org/wikipedia/commons/d/d4/Correlation_examples2.svg

FINIS