Chapter 3 Properties of Random Variables

Variables and Random Variables

A variable is a quantity (such as height, income, the inflation rate, or GDP) that takes on different values across individuals, families, nations, months, quarters, and so on. A constant, on the other hand, does not vary: for example, the number of heads on a person.

A random variable is a variable whose value is determined at least in part by chance.

Measures of Central Tendency

The mode, median, and mean are measures of the central tendency of a random variable such as the height of males. If the statement is made with respect to this variable that “the mode is 5'10",” it means that the most common height (the height which occurs with the greatest frequency) among males is 5'10".

The median is the value of the random variable such that half the observations are above it and half below it. To say that “median family income in the U.S. is $38,450” is to say that half of U.S. families have an income below that figure and half above it.
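The mode and median described above can be sketched in a few lines of Python using the standard `statistics` module. The list of heights is made up purely for illustration; it is not data from the slides.

```python
import statistics

# Hypothetical heights in inches (illustrative values, not real data).
heights = [66, 68, 68, 68, 69, 70, 71, 72, 74]

# Mode: the value that occurs with the greatest frequency.
mode_height = statistics.mode(heights)

# Median: the middle value once the observations are sorted,
# so half the observations lie below it and half above it.
median_height = statistics.median(heights)

print("mode:", mode_height)
print("median:", median_height)
```

With an even number of observations, `statistics.median` returns the average of the two middle values.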

The population mean (symbolized by the Greek letter µ) is the average value of the variable for the population. Let m denote the number of observations (corresponding to the size of the population) and let X_i denote the i-th observation. Thus, we have:

$$\mu = \frac{1}{m} \sum_{i=1}^{m} X_i$$

Suppose we want to know the average height of adult males in the U.S. The practical approach would be to measure a representative sample of the population rather than the entire population (representative meaning, for example, that basketball players would not be disproportionately represented in the sample). That is, we estimate the population mean by calculating a sample mean ($\bar{X}$). Let n be the number of observations in our sample. Thus we have:

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$
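As a rough illustration of estimating a population mean from a sample, the sketch below simulates a hypothetical population of heights and compares its mean with the mean of a random sample. The population size, the sample size, and the parameters of the simulated heights are all assumptions chosen for the example.

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible

# Hypothetical population of m simulated heights in inches.
m = 10_000
population = [random.gauss(70, 3) for _ in range(m)]

# Population mean: average over all m observations.
mu = sum(population) / m

# Sample mean: average over a random sample of n observations.
n = 200
sample = random.sample(population, n)
x_bar = sum(sample) / n

print(f"population mean = {mu:.2f}, sample mean = {x_bar:.2f}")
```

The two values will generally differ a little; drawing the sample at random is what keeps it representative on average.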

Measures of Dispersion

Often we are interested in the degree of dispersion of a random variable about its mean value. That is, are our observations of adult male height all bunched up around the mean, or do we have wide dispersion about the mean? The population variance (σ²) is a measure of the dispersion of a random variable. The variance of random variable X is defined as:

$$\sigma^2 = \frac{1}{m} \sum_{i=1}^{m} (X_i - \mu)^2$$

If we observe only a representative sample of the population, then (1) µ is unknown, and (2) not all of the X_i are known. Thus, we estimate σ² by substituting $\bar{X}$ for µ and summing across our sample observations of X. This is called the sample variance (s²):

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$

Note that we must divide through by n - 1 to obtain an unbiased estimate of σ². That is, s² is an unbiased estimator of σ² if E(s²) = σ².

The population standard deviation (σ) is given by the square root of the population variance (σ²). You can think of σ as, roughly, the “average deviation from the mean.” In the case of male adult height, one would like to see that measure expressed in inches, hence we take the square root of the variance.

Similarly, the sample standard deviation (s) is given by the square root of the sample variance (s²).
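The difference between dividing by the number of observations and dividing by n - 1 can be seen in a small sketch. The function names and the five heights are hypothetical, chosen so the arithmetic is easy to check by hand.

```python
import math

def population_variance(xs):
    """Average squared deviation from the mean, dividing by m."""
    m = len(xs)
    mu = sum(xs) / m
    return sum((x - mu) ** 2 for x in xs) / m

def sample_variance(xs):
    """Squared deviations from the sample mean, dividing by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)

# Hypothetical heights in inches; the mean is 70 and the squared
# deviations are 16, 4, 0, 4, 16, which sum to 40.
heights = [66, 68, 70, 72, 74]

print(population_variance(heights))         # 40 / 5
print(sample_variance(heights))             # 40 / 4
print(math.sqrt(sample_variance(heights)))  # sample standard deviation, in inches
```

The standard library functions `statistics.pvariance` and `statistics.variance` follow the same two conventions.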

Probability Distributions

The probability density function of variable X is constructed such that, for any interval (a, b), the probability that X takes on a value in that interval is the total area under the curve between a and b. Expressed in terms of integral calculus, with f(x) denoting the probability density function, we have:

$$\Pr(a \le X \le b) = \int_a^b f(x)\,dx$$

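[Figure: probability density P(X) plotted against X, with the points a and b marked on the X axis. The area under the curve represents probability.]

You should be familiar with this diagram.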

The normal distribution has a probability density function which is symmetric about the mean; that is, the left-hand side of the distribution is a mirror image of the right-hand side. The formula for the normal probability density function is given by:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

[Figure: The normal distribution. About 68.27% of the area under the curve lies between µ - σ and µ + σ, and about 95.45% lies between µ - 2σ and µ + 2σ.]
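The idea that probability is the area under the density curve can be checked numerically. The sketch below evaluates the normal density and approximates the area within one and two standard deviations of the mean using the trapezoidal rule. The function names, the step count, and the example values µ = 70 and σ = 3 are assumptions made for illustration.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal probability density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def prob_between(a, b, mu=0.0, sigma=1.0, steps=10_000):
    """Approximate Pr(a <= X <= b) as the area under the density (trapezoidal rule)."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h

mu, sigma = 70.0, 3.0  # hypothetical mean and standard deviation of height, in inches
within_one = prob_between(mu - sigma, mu + sigma, mu, sigma)
within_two = prob_between(mu - 2 * sigma, mu + 2 * sigma, mu, sigma)

print(f"within one sigma:  {within_one:.4f}")
print(f"within two sigmas: {within_two:.4f}")
```

The same proportions come out for any choice of µ and σ, which is why the figure can state them without reference to particular units.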

A random variable Z is said to be standard normal if it is normally distributed with a mean of zero and a variance of 1. If X is normally distributed with mean µ and variance σ², we abbreviate with the expression: X ~ N(µ, σ²)

Thus, the expression used to indicate that the distribution of Z is standard normal is: Z ~ N(0, 1)

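The standard normal distribution

[Figure: standard normal density P(Z) plotted against Z, with 0 and the point a marked on the Z axis, illustrating Pr(Z > a) when Z ~ N(0, 1).]

For example, if a = 1.93, then Pr(Z ≥ a) = 0.0268 and Pr(Z ≤ a) = 0.9732.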
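Tail probabilities for a standard normal variable are usually read from a table, but they can also be computed from the error function in Python's `math` module. The helper name `std_normal_cdf` is an assumption for this sketch.

```python
import math

def std_normal_cdf(a):
    """Pr(Z <= a) for Z ~ N(0, 1), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

a = 1.93
lower = std_normal_cdf(a)  # Pr(Z <= a)
upper = 1.0 - lower        # Pr(Z >= a): the total area under the curve is 1

print(f"Pr(Z <= {a}) = {lower:.4f}")
print(f"Pr(Z >= {a}) = {upper:.4f}")
```

Because Z is continuous, Pr(Z > a) and Pr(Z ≥ a) are equal, so the strict and non-strict inequalities give the same answer.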

Correlation of Random Variables

To say that random variables X and Y are correlated is to say that changes in X are associated with changes in Y in the probabilistic or statistical sense. However, this does not necessarily mean that a change in X was the cause of a change in Y, or vice versa. That is, “correlation does not imply causality.”

Technically speaking, the statement “X and Y are positively correlated” means that the covariance between random variables X and Y is positive (greater than zero).

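[Figure: X and Y are positively correlated random variables. Y is plotted against X, and lines at E(X) and E(Y) divide the plane into four regions: (1) X > E(X) and Y > E(Y); (2) X < E(X) and Y > E(Y); (3) X < E(X) and Y < E(Y); (4) X > E(X) and Y < E(Y).]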

The sample covariance between X and Y (i.e., our estimate of the covariance when we do not observe the entire populations of X's or Y's) is given by the following formula (the “hat” indicates an estimate):

$$\widehat{\mathrm{Cov}}(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$$

The covariance is positive if above-average values of X tend to be paired with above-average values of Y, and below-average values of X with below-average values of Y. The covariance is negative (and hence the variables are negatively correlated) if below-average values of X tend to be paired with above-average values of Y, and vice versa.

The magnitude of the covariance depends partly on the unit of measurement. Hence, we cannot depend on the size of the covariance to give an accurate measure of the strength of the relationship.
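The dependence of the covariance on units can be seen in a short sketch. The data are made up, and the divisor n - 1 is an assumption chosen to match the sample variance; some texts divide by n instead.

```python
def sample_covariance(xs, ys):
    """Sample covariance of paired observations, dividing by n - 1 (assumed convention)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical paired observations: height (inches) and weight (pounds).
heights_in = [66, 68, 70, 72, 74]
weights_lb = [140, 150, 165, 170, 185]

cov_inches = sample_covariance(heights_in, weights_lb)

# The same heights expressed in centimeters: the relationship is unchanged,
# but the covariance is scaled by the conversion factor 2.54.
heights_cm = [2.54 * h for h in heights_in]
cov_cm = sample_covariance(heights_cm, weights_lb)

print(cov_inches, cov_cm)
```

Both numbers describe the same relationship between height and weight, which is why the size of the covariance alone says little about its strength.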

The correlation coefficient (ρ) is a unit-free measure of correlation. The sample correlation coefficient is given by:

$$\hat{\rho} = \frac{\widehat{\mathrm{Cov}}(X, Y)}{s_X \, s_Y}$$

where s_X and s_Y are the sample standard deviations of X and Y.

It will always be the case that: -1 ≤ ρ ≤ 1.

If ρ = 1, there is a perfect positive (linear) correlation between X and Y. If ρ = -1, there is a perfect negative (linear) correlation between X and Y.
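A minimal sketch of the sample correlation coefficient follows, computed as the sample covariance divided by the product of the sample standard deviations. The function name and the data are hypothetical, and the n - 1 divisor is an assumption (it cancels in the ratio, so the result would be the same with n).

```python
import math

def sample_correlation(xs, ys):
    """Sample covariance divided by the product of the sample standard deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    return cov / (s_x * s_y)

# Hypothetical paired observations: height (inches) and weight (pounds).
heights_in = [66, 68, 70, 72, 74]
weights_lb = [140, 150, 165, 170, 185]

r_inches = sample_correlation(heights_in, weights_lb)

# Changing the unit of measurement leaves the correlation unchanged.
r_cm = sample_correlation([2.54 * h for h in heights_in], weights_lb)

# Exact linear relationships give the extreme values +1 and -1.
r_pos = sample_correlation(heights_in, [2 * h + 1 for h in heights_in])
r_neg = sample_correlation(heights_in, [-3 * h for h in heights_in])

print(round(r_inches, 4), round(r_cm, 4), round(r_pos, 4), round(r_neg, 4))
```

Unlike the covariance, the result is the same whether height is recorded in inches or centimeters.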