1 METU, GGIT 538 CHAPTER II REVIEW OF BASIC STATISTICAL CONCEPTS

2 METU, GGIT 538 OUTLINE (Last Week) Introduction to Spatial Data Analysis
1. Introduction
1.1. Introduction
1.1.1. Scope of spatial statistics
1.2. Spatial versus non-spatial data analysis
1.2.1. Relationship between classes of spatial entities
1.2.2. Facts on attributes of spatial entities
1.3. Types of spatial phenomena and relationships
1.4. Problem types in spatial data analysis
1.4.1. Problems of spatially discrete point data
1.4.2. Problems of spatially continuous point data
1.4.3. Problems of area data
1.4.4. Problems of spatial interaction data

3 METU, GGIT 538 OUTLINE Review of Basic Statistical Concepts
2.1. Random Variables and Probability Distributions
2.1.1. The Binomial Distribution
2.1.2. The Poisson Distribution
2.1.3. The Normal Distribution
2.2. Expectation
2.3. Maximum Likelihood Estimation
2.4. Stationarity and Isotropy

4 METU, GGIT 538 2.1. Random Variables and Probability Distributions Basic Definitions. Stochastic process: phenomena which are subject to uncertainty or governed by the laws of probability. Random variable: a variable used to represent a stochastic phenomenon mathematically; its values are subject to uncertainty. The term random does not mean that the variable is equally likely to take any of its possible values; it means only that its values are subject to some form of uncertainty. Random variables are denoted by upper-case letters (e.g. X, Y); a specific value of a random variable is denoted by a lower-case letter (e.g. x, y).

5 METU, GGIT 538 Basic Definitions. A stochastic process is described by random variables and their probability distributions. Random means that the phenomena are subject to uncertainty, not that the variables are equally likely to take on any of all possible values.

6 METU, GGIT 538 E.g. Y: result of throwing a die; y: a particular value or outcome (1, 2, 3, 4, 5, 6) after the die is thrown. There are two types of random variables: discrete (integer values, counts) and continuous (any value within a continuous range).

7 METU, GGIT 538 Probability distribution: a mathematical function which specifies the probability that particular values or ranges of values will occur; i.e. the relative chance of a random variable taking its different possible values is characterized by its probability distribution. The probability density at the value y is f_Y(y) (i.e. the probability that Y takes the value y). The probability that a random variable Y takes values between a and b is expressed by:
Discrete: P(a ≤ Y ≤ b) = Σ f_Y(y), summed over all y with a ≤ y ≤ b
Continuous: P(a ≤ Y ≤ b) = ∫_a^b f_Y(y) dy

8 METU, GGIT 538 Cumulative Probability Distribution (Distribution Function). A mathematical function F_Y(y) specifying the probability that Y takes any value less than or equal to the specific value y:
F_Y(y) = Σ f_Y(u), summed over u ≤ y, if Y is discrete
F_Y(y) = ∫_{-∞}^{y} f_Y(u) du, if Y is continuous
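To make the discrete case concrete, here is a small Python sketch (an illustration added to these notes, not from the slides) that evaluates f_Y(y), F_Y(y) and P(a ≤ Y ≤ b) for the fair-die example of slide 6; the die and its probabilities are the only assumptions.

```python
# Minimal sketch: pmf, cdf and an interval probability for a fair six-sided die.
pmf = {y: 1/6 for y in range(1, 7)}        # f_Y(y) = 1/6 for y = 1, ..., 6

def cdf(y):
    """F_Y(y) = P(Y <= y): sum of f_Y over all values not exceeding y."""
    return sum(p for value, p in pmf.items() if value <= y)

def prob_between(a, b):
    """P(a <= Y <= b) for a discrete variable: sum f_Y(y) over a <= y <= b."""
    return sum(p for value, p in pmf.items() if a <= value <= b)

print(cdf(3))              # 0.5
print(prob_between(2, 4))  # 0.5
```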

9 METU, GGIT 538

10 The joint probability distribution is used if more than one random variable is of concern (e.g. X and Y) and is expressed by f_{X,Y}(x, y): the probability associated with X taking the value x and Y taking the value y.

11 METU, GGIT 538 Important Probability Distributions. Discrete: Binomial, Poisson, Geometric. Continuous: Normal, Lognormal, t, F, χ², Exponential, Gamma.

12 METU, GGIT 538 The three fundamental probability distributions are: 1. Binomial Distribution → dichotomous events; 2. Poisson Distribution → discrete events; 3. Normal Distribution → continuous events.

13 METU, GGIT 538 2.1.1. The Binomial Distribution It relates to events in which there are only two possible outcomes. The nature of the distribution is typically explained with reference to events involving outcomes of two types (head or tail, yes or no, male or female, success or failure, etc.). It can be used to define the probability of each outcome occurring.

14 METU, GGIT 538 If the probability of X successes in N trials is denoted by p(X), the binomial distribution is defined by:
p(X) = C(N, X) p^X q^(N−X)
Where:
C(N, X) = N! / (X! (N − X)!) : the combinatorial expression
p : probability of a "success" (occurrence)
q : probability of a "failure" (non-occurrence) {q = 1 − p}
X : specified number of successes (occurrences)
N : number of trials (sequence length)

15 METU, GGIT 538 C(N, X) gives the number of different combinations in which X successes (occurrences) can appear in N trials. E.g. How many different combinations are there of 3 successes in 5 trials? (N = 5 and X = 3): C(5, 3) = 5! / (3!·2!) = 10, i.e. there are 10 ways of arranging 3 successes in a sequence of 5 trials.
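As a quick check, a short Python sketch of the combinatorial count and the binomial formula from slide 14; the values N = 5, X = 3 come from the example above, while p = 0.5 is an assumed illustration.

```python
from math import comb

def binomial_pmf(X, N, p):
    """p(X) = C(N, X) * p**X * q**(N - X): probability of X successes in N trials."""
    q = 1 - p
    return comb(N, X) * p**X * q**(N - X)

print(comb(5, 3))               # 10 ways of placing 3 successes in 5 trials
print(binomial_pmf(3, 5, 0.5))  # 0.3125, probability of exactly 3 successes when p = 0.5
```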

16 METU, GGIT 538 The form and shape of the distribution depend on the number of trials in a sequence (N) and the probability of an occurrence (p). If p = q the binomial distribution is symmetric; if p < q it is right (positively) skewed; if p > q it is left (negatively) skewed. As N becomes large the distribution becomes more symmetric and approaches the bell-shaped normal curve.

17 METU, GGIT 538 Figure 2.1. Effect of N and p on probability distribution

18 METU, GGIT 538 The Parameters of the Binomial Distribution: Mean μ = Np; Variance σ² = Npq; Standard Deviation σ = √(Npq); Skewness β = (q − p) / √(Npq). If p = q then β = 0; the further p departs from 0.5, the larger β becomes.
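A hedged numerical check of these formulas against scipy.stats.binom; the values N = 20 and p = 0.3 are assumed purely for illustration.

```python
from math import sqrt
from scipy.stats import binom

N, p = 20, 0.3
q = 1 - p

mean = N * p                        # mu = Np
var = N * p * q                     # sigma^2 = Npq
sd = sqrt(N * p * q)                # sigma = sqrt(Npq)
skew = (q - p) / sqrt(N * p * q)    # beta = (q - p) / sqrt(Npq)

m, v, s = binom.stats(N, p, moments='mvs')   # mean, variance, skewness from scipy
print(mean, var, sd, skew)
print(float(m), float(v), float(s))
```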

19 METU, GGIT 538 2.1.2. The Poisson Distribution It is used for events which have a discrete number of outcomes (i.e. counted integers); the probability of occurrence tails off as the integer count of the event becomes large. This distribution is particularly useful for modeling the number of times a phenomenon occurs within a given unit of time or space. The distribution function of the Poisson distribution is:
p(X) = λ^X e^(−λ) / X!
Where:
X : specific number of events
λ : average number of events
e = 2.7183…
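A minimal sketch of the Poisson formula above; the value λ = 2 events per unit is only an assumed illustration.

```python
from math import exp, factorial

def poisson_pmf(X, lam):
    """p(X) = lam**X * e**(-lam) / X!: Poisson probability of exactly X events."""
    return lam**X * exp(-lam) / factorial(X)

# Probability of observing 0, 1, 2, 3 events when the average is lam = 2
for X in range(4):
    print(X, round(poisson_pmf(X, 2.0), 4))
```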

20 METU, GGIT 538 E.g. Consider the location of grocers in a town. The number of grocers per grid square gives X, and the average number of grocers per grid square over the study area gives λ. Figure 2.2. Location of grocers in Sunderland

21 METU, GGIT 538 Figure 2.3. Effect of the value of λ on the Poisson distribution

22 METU, GGIT 538 The Parameters of the Poisson Distribution: Mean μ = λ; Variance σ² = λ; Standard Deviation σ = √λ; Skewness β = 1/√λ. As λ increases, β → 0 (more symmetrical); as λ decreases, β increases (more skewed).

23 METU, GGIT 538 Basic Shortcomings of the Poisson Distribution: λ is of critical importance, and its value is assumed to be constant over the time and space considered. The frequency distribution can easily be altered by varying the units of space or time for which the events are recorded.

24 METU, GGIT 538 2.1.3. The Normal Distribution It relates to situations where continuous rather than discrete outcomes are recorded for the observations under investigation. It is used when the phenomena are measured on interval or ratio scales rather than counted. The probability distribution function of the normal curve is:
Y = (1 / (σ √(2π))) e^(−(X − μ)² / (2σ²))
Where:
Y : the frequency (probability density) of the measurement X
σ : standard deviation of all X's
μ : mean of X

25 METU, GGIT 538 The points of inflection on the curve, where it changes from convex to concave, occur at 1σ either side of the mean (i.e. at μ − σ and μ + σ). The mean and standard deviation are of crucial importance in determining the position and spread of the normal curve. Figure 2.4. A typical normal probability curve

26 METU, GGIT 538 Figure 2.5. Normal curves with identical σ but different μ Figure 2.6. Normal curves with identical μ but different σ

27 METU, GGIT 538 The Standard Normal Curve: The form of the normal curve differs widely in response to the characteristics of the data set. This difficulty may be overcome by expressing the raw values as deviations from the mean (called standardization). The units of expression are proportions of the standard deviation and are termed z values. In this way, observations numerically greater than the mean become positive z values and those less than the mean become negative z values. The z value and the standard normal probability density function are expressed as:
z = (X − μ) / σ
f(z) = (1 / √(2π)) e^(−z² / 2)
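A short sketch of standardization using scipy.stats.norm; the values μ = 50, σ = 10 and X = 65 are assumed purely for illustration.

```python
from scipy.stats import norm

mu, sigma = 50.0, 10.0
X = 65.0

z = (X - mu) / sigma                       # z = (X - mu) / sigma = 1.5
print(z)
print(norm.cdf(z))                         # P(Z <= 1.5) from the standard normal
print(norm.cdf(X, loc=mu, scale=sigma))    # same probability without standardizing
```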

28 METU, GGIT 538 Important Properties of the Normal Distribution
- The probability of any particular value occurring is infinitesimally small and can be regarded as zero. One can only speak of the probability of values occurring within a given range, which is given by the area under the curve.
- A normal curve is symmetrical about the mean: half of the total area beneath the curve (which is 1) lies on each side of the mean.
- 50% of the values are less than the mean and the rest are greater; i.e. the probability of a value being less than the mean is equal to that of being greater, both being 0.5.
- 68% of the area under a normal curve lies within 1 standard deviation of the mean; 95% and 99.7% of the area lie within 2 and 3 standard deviations of the mean, respectively.
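The 68/95/99.7 figures in the last bullet can be verified directly from the standard normal; a brief sketch:

```python
from scipy.stats import norm

# Area under the standard normal within k standard deviations of the mean
for k in (1, 2, 3):
    print(k, round(norm.cdf(k) - norm.cdf(-k), 4))   # ~0.6827, 0.9545, 0.9973
```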

29 METU, GGIT 538 Useful Websites for Review
http://mathworld.wolfram.com/PoissonDistribution.html
http://mathworld.wolfram.com/GaussianDistribution.html
http://www.stats.gla.ac.uk/steps/glossary/probability_distributions.html#randvar

30 METU, GGIT 538 2.2. Expectation The expected value of a random variable Y refers to its "mean". In some cases the expected value of a function of Y, g(Y), may be of interest. As its name implies, it is simply the average value that we expect Y or g(Y) to take. The expected value of Y can also be defined as the weighted sum of the possible values, where the weight for each value is the probability associated with that value.

31 METU, GGIT 538 Expectation represents the mean of a probability distribution. If the probabilities of obtaining the amounts a_1, a_2, a_3, …, a_n are p_1, p_2, p_3, …, p_n, then the mathematical expectation is E = a_1 p_1 + a_2 p_2 + … + a_n p_n. E is a weighted average, where the weights are the probabilities of the values.

32 METU, GGIT 538

33 E.g. Find the mean of the probability distribution of the number of heads obtained in three flips of a balanced coin. The probabilities for 0, 1, 2, 3 heads are 1/8, 3/8, 3/8 and 1/8, so E = 0(1/8) + 1(3/8) + 2(3/8) + 3(1/8) = 1.5.
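The same calculation as a tiny Python sketch, with the probabilities taken from the example above:

```python
# Expected number of heads in three flips of a balanced coin:
# weighted sum of the possible values, the weights being their probabilities.
values = [0, 1, 2, 3]
probs = [1/8, 3/8, 3/8, 1/8]

E = sum(a * p for a, p in zip(values, probs))
print(E)   # 1.5
```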

34 METU, GGIT 538 The measure of how much a random variable's values tend to vary around their mean is called the variance: VAR(Y) = E[(Y − E(Y))²], and S.D.(Y) = √VAR(Y). Covariance: the expected tendency for values of X to be similar to values of Y, COV(X, Y) = E[(X − E(X))(Y − E(Y))].
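A small numerical sketch of these definitions using NumPy; the data values are assumed purely for illustration.

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 7.0, 9.0])

var_x = np.mean((x - x.mean()) ** 2)                 # VAR(X) = E[(X - E(X))^2]
sd_x = np.sqrt(var_x)                                # S.D.(X) = sqrt(VAR(X))
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))    # COV(X, Y)

print(var_x, sd_x, cov_xy)
```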

35 METU, GGIT 538 Independence: Two variables are said to be independent if the probabilistic behavior of one remains the same no matter what value the other takes. In this case their joint probability distribution factorizes as f_{X,Y}(x, y) = f_X(x) f_Y(y), so COV(X, Y) = 0 and the correlation coefficient = 0.

36 METU, GGIT 538 2.3. Maximum Likelihood Estimation There are different methods for estimating the parameters of stochastic phenomena:
- Point estimation methods: the method of moments and the method of maximum likelihood estimation
- Interval estimation methods
Unlike the method of moments, the maximum likelihood method provides a procedure for deriving the point estimator directly.

37 METU, GGIT 538 The aim of maximum likelihood estimation is to find the parameter value(s) that make the observed data most likely. If the probability of an event X, dependent on model parameters θ, is written P(X | θ), then we can talk about the likelihood L(θ | X), that is, the likelihood of the parameters given the data.

38 METU, GGIT 538 Consider a random variable with a probability distribution function f(x; θ), in which θ is the parameter, such as the parameter λ of the exponential distribution. On the basis of sample values x_1, …, x_n, one can ask: What is the most likely value of θ that produces the set of observations x_1, …, x_n? Or: among the possible values of θ, which value maximizes the likelihood of obtaining the set of observations x_1, …, x_n?

39 METU, GGIT 538 The likelihood of obtaining a particular sample value x_i can be assumed to be proportional to the value of the probability distribution function (probability density function) evaluated at x_i. The likelihood of obtaining n independent observations x_1, …, x_n is therefore L(x_1, …, x_n; θ) = f(x_1; θ) f(x_2; θ) ⋯ f(x_n; θ). The maximum likelihood estimator is the value of θ that maximizes the likelihood function L(x_1, …, x_n; θ); it is obtained as the solution of ∂L/∂θ = 0.

40 METU, GGIT 538 E.g. Say we toss a coin 100 times and observe 56 heads and 44 tails. Instead of assuming that p is 0.5, we want to find the MLE for θ, and then ask whether or not this value differs significantly from 0.50. How do we do this? We find the value of θ that makes the observed data most likely. The observed data are considered fixed; they are constants that are plugged into the binomial probability model: n = 100 (total number of tosses), h = 56 (total number of heads).

41 METU, GGIT 538 Likelihood of the observed data for different values of p:
p      L
0.48   0.0222
0.50   0.0389
0.52   0.0581
0.54   0.0739
0.56   0.0801
0.58   0.0738
0.60   0.0576
0.62   0.0378
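This table can be reproduced with a few lines of Python, a sketch using the binomial model of the previous slide:

```python
from math import comb

n, h = 100, 56   # 100 tosses, 56 heads

def likelihood(p):
    """Binomial likelihood of observing h heads in n tosses when P(head) = p."""
    return comb(n, h) * p ** h * (1 - p) ** (n - h)

for p in [0.48, 0.50, 0.52, 0.54, 0.56, 0.58, 0.60, 0.62]:
    print(p, round(likelihood(p), 4))

# The likelihood peaks at p = 0.56 = h / n, the maximum likelihood estimate.
```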

42 METU, GGIT 538 It is frequently more convenient to maximize the logarithm of the likelihood function, i.e. to solve ∂ ln L / ∂θ = 0. If the distribution function has two or more parameters θ_1, …, θ_m, the likelihood equations become ∂ ln L / ∂θ_i = 0 for i = 1, …, m.
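For the coin example this maximization can also be done numerically; a sketch using scipy.optimize, where the additive constant log C(n, h) is dropped because it does not affect the location of the maximum:

```python
import numpy as np
from scipy.optimize import minimize_scalar

n, h = 100, 56

def neg_log_likelihood(p):
    """Negative binomial log-likelihood, up to an additive constant."""
    return -(h * np.log(p) + (n - h) * np.log(1 - p))

result = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 1 - 1e-6), method='bounded')
print(result.x)   # ~0.56, agreeing with the analytical MLE h / n
```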

43 METU, GGIT 538 Problems of the MLE Method
1. The distribution of the population must be known.
2. A solution to the likelihood equation(s) may not exist.
3. Even if a solution exists, it only gives a stationary point of the likelihood function, i.e. a local maximum, a local minimum, or a saddle point. A check on the second derivatives is necessary to determine the nature of the stationary point.
4. The likelihood equations may have multiple solutions. In this case the individual solutions must be checked to determine which one yields the global maximum. It is possible to have more than one global maximum.

44 METU, GGIT 538 2.4. Stationarity and Isotropy (terminology) A spatial phenomenon is represented within a spatial domain R, and the location of each stochastic observation is expressed by the location vector s of a data point in R. The set of random variables at all locations s within R is referred to as a spatial stochastic process, {Y(s), s ∈ R}.

45 METU, GGIT 538 2.4. Stationarity and Isotropy Figure 2.7. A spatial stochastic process. Modeling real-life problems requires data and assumptions on the nature of the phenomena.

46 METU, GGIT 538 Spatial stochastic processes often exhibit a degree of spatial correlation, and this correlation somehow has to be incorporated into the analysis. In general the behavior of spatial phenomena is the result of a mixture of two types of effects: first order and second order. First-order effects relate to variation in the mean value of the process in space (a global or large-scale trend). Second-order effects result from the spatial correlation structure, or spatial dependence, in the process; in other words, they arise from the tendency of deviations of the process from its mean to follow each other in neighboring sites (local or small-scale effects).

47 METU, GGIT 538 Behavior of Spatial Phenomena. First-order effects: variation of the mean value of the process in space (global or large-scale trend). Second-order effects: correlation in the deviations of the process values from the mean.

48 METU, GGIT 538 E.g. Suppose that iron particles are scattered onto a sheet of paper marked with a fine grid. The numbers of particles landing in the different grid squares represent a spatial stochastic process. As long as the mechanism by which the iron particles are scattered is purely random, the process should lack both first- and second-order effects (Case I). Figure 2.7. Random scatter of iron particles (Case I)

49 METU, GGIT 538 Suppose that a small number of weak magnets are placed under the paper at different points and the iron particles are scattered again. The result will be a process with a spatial pattern arising from first-order effects: clustering in the grid-square counts will occur globally, at and around the sites of the magnets (Case II). Figure 2.8. Scatter of iron particles with magnets underneath the paper (Case II)

50 METU, GGIT 538 Now remove the magnets, weakly magnetize the iron particles instead, and scatter them again. The result is a process with a spatial pattern arising from a second-order effect: some degree of local clustering will occur due to the tendency of the particles to attract or repel each other (Case III). If the magnets are now replaced under the paper and the magnetized particles scattered again, we end up with a spatial pattern arising from both first-order and second-order effects.

51 METU, GGIT 538 Stationarity: A spatial process is stationary (or homogeneous) if its statistical properties are independent of absolute location in R. This implies that:
- E[Y(s)] and VAR[Y(s)] are constant over R and do not depend on location.
- COV[Y(s_i), Y(s_j)] between values at any two sites s_i and s_j depends only on the relative locations of these sites (the distance and direction between them), not on their absolute locations in R.
If the mean, variance or covariance structure changes over R, the process exhibits non-stationarity or heterogeneity.

52 METU, GGIT 538 Stationarity
- Spatial data usually represent a single realization of a random process, so some degree of stationarity must be assumed in order to make inferences about the data.
- Stationarity is a form of location invariance (invariance in the mean and variance of the process).
- Stationarity is the quality of a process in which the statistical parameters (mean and standard deviation) of the process do not change with space or time.

53 METU, GGIT 538 Stationarity. Strict (strong) stationarity requires equivalence of the distribution functions under translation and rotation, so that all moments, including the mean and variance, are constant. Weak stationarity requires a constant mean and a covariance that is independent of absolute location; the covariance depends only on the distance and direction between points.

54 METU, GGIT 538 Non-stationary mean: decreasing from west to east.

55 METU, GGIT 538 Stationarity E[Y(s)] = μ (constant) for all s ∈ R. Cov[Y(s_1), Y(s_2)] = Cov[Y(s_3), Y(s_4)]; Cov[Y(s_5), Y(s_6)] = Cov[Y(s_9), Y(s_10)]; Cov[Y(s_1), Y(s_2)] ≠ Cov[Y(s_7), Y(s_8)].
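An illustrative simulation added to these notes (not from the slides): a one-dimensional stationary process built as a moving average of white noise, whose estimated covariance depends only on the lag between locations, not on their absolute positions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
noise = rng.standard_normal(n + 2)
Y = (noise[:-2] + noise[1:-1] + noise[2:]) / 3.0   # 3-point moving average -> spatial correlation

def cov_at_lag(Y, h):
    """Sample covariance between Y(s) and Y(s + h), pooled over all locations s."""
    a, b = Y[:-h], Y[h:]
    return np.mean((a - a.mean()) * (b - b.mean()))

# Covariance decays with the lag and is essentially zero once the averaging windows no longer overlap.
print(cov_at_lag(Y, 1), cov_at_lag(Y, 2), cov_at_lag(Y, 3))
```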

56 METU, GGIT 538 Isotropy: A spatial process is called isotropic if the covariance depends only on the distance between s_i and s_j, not on the direction in which they are separated. Figure 2.9. Stationary and isotropic spatial processes. E.g. Weakly magnetized iron particles scattered onto paper with no magnets underneath represent an isotropic process.

57 METU, GGIT 538 Isotropic: refers to a spatial process that evolves in the same way in all directions. Anisotropic: a spatial process in which the correlation and covariance differ with direction. Most methods assume that the spatial correlation is isotropic.

58 METU, GGIT 538 Modeling Spatial Processes Most methods assume that the spatial correlation is isotropic, allow heterogeneity in the mean, and assume that deviations from the mean are stationary.

