STATISTICS Random Variables and Distribution Functions

STATISTICS Random Variables and Distribution Functions Professor Ke-Sheng Cheng Department of Bioenvironmental Systems Engineering National Taiwan University

Definition of random variable (RV)
For a given probability space $(\Omega, \mathcal{A}, P[\cdot])$, a random variable, denoted by $X$ or $X(\cdot)$, is a function with domain $\Omega$ and counterdomain the real line. The function $X(\cdot)$ must be such that the set $A_r$, defined by $A_r = \{\omega : X(\omega) \le r\}$, belongs to $\mathcal{A}$ for every real number $r$. Unlike the probability, which is defined on the event space, a random variable is defined on the sample space.
7/23/2018 Laboratory for Remote Sensing Hydrology and Spatial Modeling, Dept of Bioenvironmental Systems Engineering, National Taiwan Univ.

[Figure: random experiment, sample space, event space, probability space. The annotation stating which quantity is defined and which is not did not survive extraction.]

Cumulative distribution function (CDF)
The cumulative distribution function of a random variable $X$, denoted by $F_X(\cdot)$, is defined to be
$$F_X(x) = P[X \le x] = P[\{\omega : X(\omega) \le x\}]$$
for every real number $x$.

Consider the experiment of tossing two fair coins. Let random variable $X$ denote the number of heads. The CDF of $X$ is
$$F_X(x) = \begin{cases} 0, & x < 0 \\ 1/4, & 0 \le x < 1 \\ 3/4, & 1 \le x < 2 \\ 1, & x \ge 2 \end{cases}$$
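The two-coin CDF can be checked by enumerating the sample space. The following is a minimal sketch; the names `outcomes`, `X` and `cdf` are chosen here for illustration and do not come from the slides.

```python
from fractions import Fraction
from itertools import product

# Sample space of two fair coin tosses; each outcome has probability 1/4.
outcomes = list(product("HT", repeat=2))
prob = Fraction(1, len(outcomes))

def X(outcome):
    """Random variable: number of heads in the outcome."""
    return outcome.count("H")

def cdf(x):
    """F_X(x) = P[X <= x], summed over the sample space."""
    return sum((prob for w in outcomes if X(w) <= x), Fraction(0))

for x in (-1, 0, 0.5, 1, 2, 3):
    print(x, cdf(x))
```

The CDF is a step function: it is flat between the integers 0, 1, 2 and jumps at each of them by the probability of that value.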

Indicator function or indicator variable
Let $\Omega$ be any space with points $\omega$ and $A$ any subset of $\Omega$. The indicator function of $A$, denoted by $I_A(\cdot)$, is the function with domain $\Omega$ and counterdomain equal to the set consisting of the two real numbers 0 and 1, defined by
$$I_A(\omega) = \begin{cases} 1, & \omega \in A \\ 0, & \omega \notin A \end{cases}$$

Discrete random variables
A random variable $X$ will be defined to be discrete if the range of $X$ is countable. If $X$ is a discrete random variable with values $x_1, x_2, \ldots, x_n, \ldots$, then the function denoted by $f_X(\cdot)$ and defined by
$$f_X(x) = \begin{cases} P[X = x_j], & x = x_j,\ j = 1, 2, \ldots \\ 0, & x \ne x_j \end{cases}$$
is defined to be the discrete density function or probability mass function (or simply probability function) of $X$.

Continuous random variables
A random variable $X$ will be defined to be continuous if there exists a function $f_X(\cdot)$ such that
$$F_X(x) = \int_{-\infty}^{x} f_X(u)\,du$$
for every real number $x$. The function $f_X(\cdot)$ is called the probability density function of $X$.
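The relation between a density and its CDF can be illustrated numerically. The sketch below assumes an exponential density with rate 1, which is not taken from the slides, and approximates the integral with the trapezoidal rule.

```python
import math

def pdf(u, rate=1.0):
    """Exponential density, used here only as an illustrative f_X."""
    return rate * math.exp(-rate * u) if u >= 0 else 0.0

def cdf_numeric(x, steps=10_000):
    """Approximate F_X(x) as the integral of f_X up to x (trapezoidal rule).

    The density is zero below 0, so the integration starts at 0.
    """
    if x <= 0:
        return 0.0
    h = x / steps
    total = 0.5 * (pdf(0.0) + pdf(x))
    total += sum(pdf(i * h) for i in range(1, steps))
    return total * h

x = 2.0
print(cdf_numeric(x), 1 - math.exp(-x))  # numeric value vs closed form
```

For this density the closed form is $F_X(x) = 1 - e^{-x}$ for $x \ge 0$, so the two printed numbers should agree to several decimal places.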

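Properties of a CDF
1. $F_X(-\infty) = \lim_{x \to -\infty} F_X(x) = 0$ and $F_X(+\infty) = \lim_{x \to +\infty} F_X(x) = 1$.
2. $F_X(\cdot)$ is monotone nondecreasing: $F_X(a) \le F_X(b)$ for $a < b$.
3. $F_X(\cdot)$ is continuous from the right, i.e. $\lim_{0 < h \to 0} F_X(x + h) = F_X(x)$.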

Properties of a PDF
For a continuous random variable $X$, $f_X(x) \ge 0$ for all $x$, $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$, and $f_X(x) = dF_X(x)/dx$ wherever the derivative exists. Two continuous random variables having the same CDF can have different PDFs, because a density can be changed at isolated points without changing its integral.

Equality of random variables
Two random variables $X$ and $Y$ can have the same cumulative distribution function without being the same random variable. $X$ and $Y$ are equal only if they are equal everywhere, i.e. $X(\omega) = Y(\omega)$ for every $\omega \in \Omega$. Conditions such as $F_X(x) = F_Y(x)$ for all $x$ do not imply that $X$ and $Y$ are equal.
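A small counterexample makes the distinction concrete. In the sketch below (an illustration, not from the slides) $X$ counts heads and $Y$ counts tails in two fair coin tosses: both have the same CDF, yet they differ on some outcomes.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=2))   # sample space; each point has probability 1/4
prob = Fraction(1, 4)

def X(w):
    return w.count("H")   # number of heads

def Y(w):
    return w.count("T")   # number of tails

def cdf(rv, x):
    """P[rv <= x] computed over the sample space."""
    return sum((prob for w in outcomes if rv(w) <= x), Fraction(0))

same_cdf = all(cdf(X, x) == cdf(Y, x) for x in (-1, 0, 1, 2, 3))
equal_everywhere = all(X(w) == Y(w) for w in outcomes)
print(same_cdf, equal_everywhere)
```

On the outcome (H, H), for instance, $X = 2$ while $Y = 0$, so the two functions on the sample space are different even though their distributions coincide.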

Example 1
Determine which of the following are valid distribution functions:

Example 2
Determine the real constant $a$, for arbitrary real constants $m$ and $b > 0$, such that $f_X(x)$ is a valid density function.

The function is symmetric about $m$.

Mixed distribution
Not all random variables are either continuous or discrete. A distribution that is a mixture of a discrete distribution and a continuous distribution is called a mixed distribution. Example: let $X$ be the random variable that represents the time delay that a motorist needs to wait after making a required stop at a traffic stop sign.
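The stop-sign delay can be sketched by simulation. The probability of zero delay (`p_zero`) and the exponential waiting time with mean `mean_wait` are assumptions made here for illustration; the slides' own numbers did not survive.

```python
import random

def sample_delay(rng, p_zero=0.3, mean_wait=10.0):
    """Draw one waiting time from a hypothetical mixed distribution.

    With probability p_zero the delay is exactly 0 (discrete part);
    otherwise it is exponential with mean mean_wait (continuous part).
    """
    if rng.random() < p_zero:
        return 0.0
    return rng.expovariate(1.0 / mean_wait)

rng = random.Random(0)
delays = [sample_delay(rng) for _ in range(100_000)]
frac_zero = sum(d == 0.0 for d in delays) / len(delays)
print(round(frac_zero, 3))  # close to p_zero: the CDF has a jump at 0
```

The CDF of such a variable jumps by `p_zero` at 0 and then increases continuously, so it is neither a pure step function nor the integral of a density.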

Censored and Truncated Data
$Y$ is censored when we observe $X$ for all observations, but we only know the true value of $Y$ for a restricted range of observations. Values of $Y$ in a certain range are reported as a single value, or there is significant clustering around a value, say 0.
$Y$ is censored from below, or left-censored, if all values of $Y$ lower than $k$ are recorded as $k$.
$Y$ is censored from above, or right-censored, if all values of $Y$ higher than $k$ are recorded as $k$.

A truncated distribution ($Y$) is a conditional distribution that results from restricting the domain of some other probability distribution ($X$).
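The difference between censoring and truncation shows up in what happens to the sample size. A minimal sketch, assuming standard normal latent values and a threshold `k = 0` (both hypothetical):

```python
import random

rng = random.Random(1)
k = 0.0
x = [rng.gauss(0.0, 1.0) for _ in range(10)]   # hypothetical latent values

# Left-censoring at k: every value below k is recorded as k; sample size is unchanged.
censored = [max(v, k) for v in x]

# Truncation at k: values below k are not observed at all; sample size can shrink.
truncated = [v for v in x if v >= k]

print(len(x), len(censored), len(truncated))
```

Censored data keep one record per observation but pile up at `k`, whereas truncated data simply lack the observations outside the retained range.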

Mixture Distribution
A mixture distribution is the probability distribution of a random variable that is derived from a collection of other random variables as follows: first, a random variable is selected by chance from the collection according to given probabilities of selection, and then the value of the selected random variable is realized.
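The two-step construction translates directly into a sampling routine. The sketch below assumes two normal components with made-up weights and parameters; none of these values come from the slides.

```python
import random

# Hypothetical components: (selection probability, mean, standard deviation).
components = [(0.7, 0.0, 1.0), (0.3, 5.0, 0.5)]

def sample_mixture(rng):
    """Step 1: pick a component by its probability. Step 2: draw from it."""
    weights = [w for w, _, _ in components]
    _, mu, sigma = rng.choices(components, weights=weights, k=1)[0]
    return rng.gauss(mu, sigma)

rng = random.Random(2)
draws = [sample_mixture(rng) for _ in range(50_000)]
sample_mean = sum(draws) / len(draws)
mixture_mean = sum(w * mu for w, mu, _ in components)   # weighted average of component means
print(round(sample_mean, 2), mixture_mean)
```

The mean of a mixture is the weighted average of the component means, so the sample mean should settle near `mixture_mean` as the number of draws grows.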

Characterizing random variables
Cumulative distribution function; probability density function; expectation (expected value); variance; moments; quantile; median; mode.

Expectation of a random variable
The expectation (or mean, expected value) of $X$, denoted by $\mu_X$ or $E(X)$, is defined by:
$$E(X) = \sum_j x_j f_X(x_j)$$ if $X$ is discrete with values $x_1, x_2, \ldots$, and
$$E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx$$ if $X$ is continuous with probability density function $f_X(\cdot)$.

Rules for expectation
Let $X$ and $X_i$ be random variables and $c$ be any real constant.
1. $E(c) = c$
2. $E(cX) = c\,E(X)$
3. $E\left(\sum_i X_i\right) = \sum_i E(X_i)$

Variance of a random variable
$$\mathrm{Var}(X) = \sigma_X^2 = E[(X - \mu_X)^2] = E(X^2) - \mu_X^2$$

$\sigma_X = \sqrt{\mathrm{Var}(X)}$ is called the standard deviation of $X$. Variance characterizes the dispersion of data with respect to the mean. Thus, shifting a density function does not change its variance.
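The shift invariance of the variance can be verified exactly on a small discrete example. This sketch reuses the number of heads in two fair coin tosses; the shift of 10 is an arbitrary choice.

```python
from fractions import Fraction

# pmf of X = number of heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def mean(p):
    """E(X) = sum of x * f_X(x)."""
    return sum(x * px for x, px in p.items())

def variance(p):
    """Var(X) = E[(X - mu)^2]."""
    mu = mean(p)
    return sum((x - mu) ** 2 * px for x, px in p.items())

shifted = {x + 10: px for x, px in pmf.items()}   # pmf of X + 10

print(mean(pmf), variance(pmf))          # 1 1/2
print(mean(shifted), variance(shifted))  # 11 1/2
```

Shifting moves the mean by the same amount, so every deviation from the mean, and hence the variance, is unchanged.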

Rules for variance
Let $X$ be a random variable and $c$ be any real constant.
1. $\mathrm{Var}(c) = 0$
2. $\mathrm{Var}(cX) = c^2\,\mathrm{Var}(X)$
3. $\mathrm{Var}(X + c) = \mathrm{Var}(X)$

Two random variables are said to be independent if knowledge of the value assumed by one gives no clue to the value assumed by the other. If two random variables $X$ and $Y$ are independent, then $E(XY) = E(X)E(Y)$.
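The product rule for independent variables can be confirmed by exact enumeration. The sketch below uses two independent fair dice, an example chosen here rather than taken from the slides.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 36)   # two independent fair dice: every pair (x, y) is equally likely

e_x = sum(x * p for x, y in product(faces, faces))
e_y = sum(y * p for x, y in product(faces, faces))
e_xy = sum(x * y * p for x, y in product(faces, faces))

print(e_x, e_y, e_xy, e_x * e_y)
```

The converse does not hold in general: $E(XY) = E(X)E(Y)$ means only that the variables are uncorrelated, which is weaker than independence.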

Moments and central moments of a random variable
The $r$th moment of $X$ (about the origin) is $\mu_r' = E(X^r)$; in particular $\mu_1' = \mu_X$.
The $r$th central moment of $X$ is $\mu_r = E[(X - \mu_X)^r]$; in particular $\mu_2 = \sigma_X^2$.

Properties of moments

Standardization, or normalization, does not change the skewness and kurtosis.
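This invariance can be checked on a sample. The sketch assumes the usual moment-ratio definitions, skewness $\mu_3 / \mu_2^{3/2}$ and kurtosis $\mu_4 / \mu_2^2$, since the slide's own definitions were lost; the exponential sample is likewise hypothetical.

```python
import random

def central_moment(data, r):
    """r-th sample central moment (divides by n)."""
    m = sum(data) / len(data)
    return sum((v - m) ** r for v in data) / len(data)

def skewness(data):
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def kurtosis(data):
    return central_moment(data, 4) / central_moment(data, 2) ** 2

rng = random.Random(3)
data = [rng.expovariate(1.0) for _ in range(1_000)]   # hypothetical skewed sample

m = sum(data) / len(data)
s = central_moment(data, 2) ** 0.5
z = [(v - m) / s for v in data]                       # standardized values

print(abs(skewness(data) - skewness(z)) < 1e-9,
      abs(kurtosis(data) - kurtosis(z)) < 1e-9)
```

Both coefficients are ratios in which the shift cancels in the central moments and the scale cancels between numerator and denominator, which is why standardizing leaves them unchanged.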

Quantile
The $q$th quantile of a random variable $X$, denoted by $\xi_q$, is defined as the smallest number $\xi$ satisfying $F_X(\xi) \ge q$.
[Figure panels: Discrete; Uniform]
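For a discrete random variable the definition amounts to walking up the CDF until it first reaches $q$. A minimal sketch, using the number of heads in two fair coin tosses as an illustrative pmf:

```python
from fractions import Fraction

# pmf of X = number of heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def quantile(p, q):
    """Smallest support point x with F_X(x) >= q, for 0 < q <= 1."""
    cum = Fraction(0)
    for x in sorted(p):
        cum += p[x]
        if cum >= q:
            return x
    raise ValueError("q must not exceed 1")

print(quantile(pmf, Fraction(1, 4)),
      quantile(pmf, Fraction(1, 2)),
      quantile(pmf, Fraction(9, 10)))   # 0 1 2
```

Taking the smallest such number makes the quantile well defined even where the CDF is flat or jumps over the level $q$.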

Median and mode
The median of a random variable is the 0.5th quantile, or $\xi_{0.5}$. The mode of a random variable $X$ is defined as the value $u$ at which $f_X(u)$ is the maximum of $f_X(\cdot)$.

Note: For a unimodal, positively skewed distribution, the mean is typically the highest estimate of central tendency and the mode the lowest. For a unimodal, negatively skewed distribution, the mean is typically the lowest estimate of central tendency and the mode the highest. In such cases the median falls between the mean and the mode. This ordering is a rule of thumb rather than a theorem; exceptions exist, particularly for discrete distributions.

Moment generating function
$$m_X(t) = E\left(e^{tX}\right),$$
provided the expectation exists for every $t$ in some interval $-h < t < h$, $h > 0$.

Usage of MGF
The MGF can be used to express moments in terms of the PDF parameters, and such expressions can in turn be used to express the mean, variance, coefficient of skewness, etc. in terms of those parameters. Random variables with the same MGF have the same probability distribution.

The moment generating function of a sum of independent random variables is the product of the moment generating functions of the individual random variables: if $X_1, \ldots, X_n$ are independent and $S = \sum_i X_i$, then $m_S(t) = \prod_i m_{X_i}(t)$.
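The product property can be checked numerically for a simple case. The sketch below assumes two independent Bernoulli variables with a hypothetical success probability `p = 0.3` and compares the MGF of their sum with the square of the individual MGF.

```python
import math
from itertools import product

p = 0.3   # hypothetical success probability
bernoulli = {0: 1 - p, 1: p}

def mgf(pmf, t):
    """m(t) = E[exp(tX)] for a discrete pmf."""
    return sum(math.exp(t * x) * px for x, px in pmf.items())

# pmf of S = X1 + X2 for independent X1, X2, built by enumerating all pairs.
pmf_sum = {}
for (x1, p1), (x2, p2) in product(bernoulli.items(), repeat=2):
    pmf_sum[x1 + x2] = pmf_sum.get(x1 + x2, 0.0) + p1 * p2

t = 0.5
print(mgf(pmf_sum, t), mgf(bernoulli, t) ** 2)   # the two values agree
```

The joint probabilities factor as `p1 * p2` only because the two variables are independent; without independence the product form does not hold in general.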

Expected value of a function of a random variable
$$E[g(X)] = \sum_j g(x_j) f_X(x_j)$$ if $X$ is discrete, and
$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx$$ if $X$ is continuous.
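The expectation of $g(X)$ can be computed either from the distribution of $X$ or from the distribution of $Y = g(X)$; both routes give the same number. A minimal sketch with an illustrative choice $g(x) = (x - 1)^2$:

```python
from fractions import Fraction

# pmf of X = number of heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def g(x):
    return (x - 1) ** 2   # hypothetical choice of g

# Route 1: weight g(x) by the pmf of X.
e_direct = sum(g(x) * px for x, px in pmf.items())

# Route 2: derive the pmf of Y = g(X) first, then take its mean.
pmf_y = {}
for x, px in pmf.items():
    pmf_y[g(x)] = pmf_y.get(g(x), Fraction(0)) + px
e_via_y = sum(y * py for y, py in pmf_y.items())

print(e_direct, e_via_y)   # 1/2 1/2
```

Route 1 is usually more convenient because it avoids deriving the distribution of $Y$, which matters when $g$ is not one-to-one, as here.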

If $Y = g(X)$

[Figure: plot of $Y = g(X)$ against $X$, with a level $y$ marked on the $Y$ axis and points $x_1$, $x_2$, $x_3$ marked on the $X$ axis.]

Theorem

Chebyshev Inequality
For a random variable $X$ with mean $\mu_X$ and finite variance $\sigma_X^2$, and for every $k > 0$,
$$P[\,|X - \mu_X| \ge k\sigma_X\,] \le \frac{1}{k^2}.$$

The Chebyshev inequality gives a bound, which does not depend on the distribution of $X$, for the probability of particular events described in terms of a random variable and its mean and variance.
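The bound can be compared with observed tail frequencies. The sketch below assumes an exponential distribution with rate 1 (mean 1, standard deviation 1), chosen here only as an example of a skewed distribution.

```python
import random

rng = random.Random(4)
data = [rng.expovariate(1.0) for _ in range(100_000)]   # hypothetical sample

mu, sigma = 1.0, 1.0   # exact mean and standard deviation of the exponential(1) distribution

def tail_freq(k):
    """Observed relative frequency of |X - mu| >= k * sigma in the sample."""
    return sum(abs(v - mu) >= k * sigma for v in data) / len(data)

for k in (1.5, 2.0, 3.0):
    print(k, round(tail_freq(k), 4), "<=", round(1 / k**2, 4))
```

The observed frequencies fall well below $1/k^2$, which is expected: the inequality holds for every distribution with finite variance, so it is rarely tight for any particular one.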