3
Mathematical Notation The mathematical notation used most often in this course is the summation notation The Greek letter Σ (capital sigma) is used as a shorthand way of indicating that a sum is to be taken: the expression Σ (i = 1 to n) x_i is equivalent to: x_1 + x_2 + x_3 + … + x_n
4
Summation Notation: Simplification A summation will often be written leaving out the upper and/or lower limits of the summation, e.g. Σ x_i, assuming that all of the terms available are to be summed
5
Summation Notation: Rules Rule I: Summing a constant n times yields a result of na: Σ (i = 1 to n) a = na Here we are simply using the summation notation to carry out a multiplication, e.g.: Σ (i = 1 to 4) 3 = 3 + 3 + 3 + 3 = 4 * 3 = 12
6
Summation Notation: Rules Rule II: Constants may be taken outside of the summation sign: Σ c * x_i = c * Σ x_i
7
Summation Notation: Rules Rule III: The order in which addition operations are carried out is unimportant: Σ (x_i + y_i) = Σ x_i + Σ y_i
8
Summation Notation: Rules Rule IV: Exponents are handled differently depending on whether they are applied to the observation term or the whole sum: Σ x_i^2 (square each term, then sum) is generally not equal to (Σ x_i)^2 (sum the terms, then square)
9
Summation Notation: Rules Rule V: Products are handled much like exponents: Σ x_i * y_i (multiply paired terms, then sum) is generally not equal to (Σ x_i) * (Σ y_i)
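A minimal Python sketch (the data values are invented purely for illustration) that checks Rules I–V numerically:

```python
# Check the summation rules on a small, made-up data set.
x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 3.0, 5.0, 7.0]
a, c, n = 3.0, 10.0, len(x)

# Rule I: summing a constant n times gives n * a
assert sum(a for _ in range(n)) == n * a

# Rule II: constants may be moved outside the summation
assert sum(c * xi for xi in x) == c * sum(x)

# Rule III: the sum of (x_i + y_i) equals the sum of x plus the sum of y
assert sum(xi + yi for xi, yi in zip(x, y)) == sum(x) + sum(y)

# Rule IV: the sum of squares is generally NOT the square of the sum
print(sum(xi**2 for xi in x), sum(x)**2)                       # 120.0 vs 400.0

# Rule V: the sum of products is generally NOT the product of the sums
print(sum(xi * yi for xi, yi in zip(x, y)), sum(x) * sum(y))   # 100.0 vs 320.0
```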
10
Pi Notation Whereas the summation notation refers to the addition of terms, the product notation applies to the multiplication of terms It is denoted by the capital Greek letter Π (pi), and is used in the same way as the summation notation: Π (i = 1 to n) x_i = x_1 * x_2 * … * x_n
11
Factorial The factorial of a positive integer, n, is equal to the product of the first n integers: n! = n * (n – 1) * … * 2 * 1 Factorials can be denoted by an exclamation point There is also a convention that 0! = 1 Factorials are not defined for negative integers or non-integers
12
Combinations Combinations refer to the number of possible outcomes that particular probability experiments may have Specifically, the number of ways that r items may be chosen from a group of n items is denoted by C(n,r) or nCr, and is calculated as: C(n,r) = n! / (r! * (n – r)!)
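A short Python illustration of factorials and combinations; math.factorial and math.comb are standard-library functions (Python 3.8+):

```python
import math

# 5! = 5 * 4 * 3 * 2 * 1 = 120, and 0! = 1 by convention
print(math.factorial(5))   # 120
print(math.factorial(0))   # 1

# Number of ways to choose r = 2 items from a group of n = 5 items:
# C(5, 2) = 5! / (2! * 3!) = 10
print(math.comb(5, 2))                                                # 10
print(math.factorial(5) // (math.factorial(2) * math.factorial(3)))   # 10
```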
13
Descriptive Statistics Measures of central tendency –Measures of the location of the middle or the center of a distribution –Mean, median, mode Measures of dispersion –Describe how the observations are distributed –Variance, standard deviation, range, etc
14
Measures of Central Tendency – Mean Mean – Most commonly used measure of central tendency: the sum of the observations divided by their number, x̄ = Σ x_i / n Note: Assuming that each observation is equally significant Sensitive to outliers
15
Measures of Central Tendency – Mean A standard geographic application of the mean is to locate the center (centroid) of a spatial distribution Assign to each member a gridded coordinate and calculate the mean value in each coordinate direction --> Bivariate mean or mean center For a set of (x, y) coordinates, the mean center is calculated as: x̄ = Σ x_i / n, ȳ = Σ y_i / n
16
Weighted Mean We can also calculate a weighted mean using some weighting factor w_i: x̄_w = Σ (w_i * x_i) / Σ w_i
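A small sketch computing the mean center and a weighted mean center for a handful of (x, y) points; the coordinates and weights below are invented for illustration:

```python
# Invented point coordinates (e.g., town locations) and weights (e.g., populations).
points = [(2.0, 3.0), (4.0, 7.0), (6.0, 1.0), (8.0, 5.0)]
weights = [100.0, 300.0, 200.0, 400.0]
n = len(points)

# Unweighted mean center: average each coordinate direction separately.
x_bar = sum(x for x, _ in points) / n
y_bar = sum(y for _, y in points) / n

# Weighted mean: sum of (weight * value) divided by the sum of the weights.
w_sum = sum(weights)
x_bar_w = sum(w * x for (x, _), w in zip(points, weights)) / w_sum
y_bar_w = sum(w * y for (_, y), w in zip(points, weights)) / w_sum

print("mean center:         ", (x_bar, y_bar))        # (5.0, 4.0)
print("weighted mean center:", (x_bar_w, y_bar_w))    # (5.8, 4.6)
```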
17
Measures of Central Tendency – Median Median – This is the value of a variable such that half of the observations are above and half are below this value i.e. this value divides the distribution into two groups of equal size When the number of observations is odd, the median is simply equal to the middle value When the number of observations is even, we take the median to be the average of the two values in the middle of the distribution
18
Measures of Central Tendency – Mode Mode – This is the most frequently occurring value in the distribution This is the only measure of central tendency that can be used with nominal data The mode allows the distribution's peak to be located quickly
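A quick comparison of the three measures using Python's standard statistics module on an invented sample:

```python
import statistics

# Invented sample with an outlier (98) and a repeated value (4).
data = [1, 2, 4, 4, 5, 7, 98]

print(statistics.mean(data))    # 17.28..., pulled upward by the outlier
print(statistics.median(data))  # 4, the middle of the seven ordered values
print(statistics.mode(data))    # 4, the most frequently occurring value
```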
19
Which one is better: mean, median, or mode? Most often, the mean is selected by default The mean's key advantage is that it is sensitive to any change in the value of any observation The mean's disadvantage is that it is very sensitive to outliers We really must consider the nature of the data, the distribution, and our goals to choose properly
20
Some Characteristics of Data Not all data is the same. There are some limitations as to what can and cannot be done with a data set, depending on the characteristics of the data Some key characteristics that must be considered are: A. Continuous vs. Discrete B. Grouped vs. Individual C. Scale of Measurement
21
C. Scales of Measurement The data used in statistical analyses can be divided into four types: 1. The Nominal Scale 2. The Ordinal Scale 3. The Interval Scale 4. The Ratio Scale As we progress through these scales, the types of data they describe have increasing information content
22
The Nominal Scale Nominal scale data are data that can simply be broken down into categories, i.e., having to do with names or types Dichotomous or binary nominal data has just two types, e.g., yes/no, female/male, is/is not, hot/cold, etc Multichotomous data has more than two types, e.g., vegetation types, soil types, counties, eye color, etc Not a scale in the sense that categories cannot be ranked or ordered (no greater/less than)
23
The Ordinal Scale Ordinal scale data can be categorized AND can be placed in an order, i.e., categories that can be assigned a relative importance and can be ranked such that numerical category values have meaning in terms of order –star-system restaurant rankings: 5 stars > 4 stars, 4 stars > 3 stars, 5 stars > 2 stars BUT ordinal data still are not scalar in the sense that differences between categories do not have a quantitative meaning –i.e., a 5-star restaurant is not superior to a 4-star restaurant by the same amount as a 4-star restaurant is to a 3-star restaurant
24
The Interval Scale Interval scale data take the notion of ranking items in order one step further, since the distance between adjacent points on the scale is equal For instance, the Fahrenheit scale is an interval scale: each degree is equal, but there is no absolute zero point. This means that although we can add and subtract degrees (100° is 10° warmer than 90°), we cannot multiply values or create ratios (100° is not twice as warm as 50°)
25
The Ratio Scale Similar to the interval scale, but with the addition of a meaningful zero value, which allows us to compare values using multiplication and division operations, e.g., precipitation, weights, heights, etc e.g., rain – We can say that 2 inches of rain is twice as much rain as 1 inch of rain because this is a ratio scale measurement e.g., age – a 100-year-old person is indeed twice as old as a 50-year-old one
26
Scales of Measurements & Measures of Central Tendency The mean is valid only for interval data or ratio data. The median can be determined for ordinal data as well as interval and ratio data. The mode can be used with nominal, ordinal, interval, and ratio data Mode is the only measure of central tendency that can be used with nominal data
27
Measures of Dispersion Measures of dispersion are concerned with the distribution of values around the mean in data: 1. Range 2. Interquartile range 3. Variance 4. Standard deviation 5. z-scores 6. Coefficient of variation (CV)
28
Measures of Dispersion – Range 1. Range – this is the most simply formulated of all measures of dispersion Given a set of measurements x_1, x_2, x_3, …, x_{n-1}, x_n, the range is defined as the difference between the largest and smallest values: Range = x_max – x_min This is another descriptive measure that is vulnerable to the influence of outliers in a data set, which result in a range that is not really descriptive of most of the data
29
Measures of Dispersion – Interquartile Range Quartiles – We can divide distributions into four parts, each containing 25% of the observations Percentiles – each contains 1% of all values Interquartile range – The difference between the 25th and 75th percentiles
30
Measures of Dispersion – Variance Variance is formulated as the sum of squares of statistical distances (deviations from the mean) divided by the population size or the sample size minus one: population variance σ² = Σ (x_i – μ)² / N, sample variance s² = Σ (x_i – x̄)² / (n – 1)
31
Measures of Dispersion – Standard Deviation Standard deviation is equal to the square root of the variance Compared with variance, standard deviation has a scale closer to that used for the mean and the original data
32
Measures of Dispersion – z-score Since data come from distributions with different means and different degrees of variability, it is common to standardize observations One way to do this is to transform each observation into a z-score: z_i = (x_i – x̄) / s It may be interpreted as the number of standard deviations an observation is away from the mean
33
Measures of Dispersion – Coefficient of Variation Coefficient of variation (CV) measures the spread of a set of data as a proportion of its mean. It is the ratio of the sample standard deviation to the sample mean: CV = s / x̄ It is sometimes expressed as a percentage There is an equivalent definition for the coefficient of variation of a population
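A sketch computing the dispersion measures listed above for a small invented sample (the sample variance uses n – 1 in the denominator; statistics.quantiles requires Python 3.8+):

```python
import math
import statistics

data = [4.0, 8.0, 6.0, 5.0, 3.0, 7.0, 9.0, 2.0]
n = len(data)
mean = sum(data) / n

# Range: difference between the largest and smallest values.
data_range = max(data) - min(data)

# Interquartile range: difference between the 75th and 25th percentiles.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Sample variance: sum of squared deviations divided by (n - 1).
variance = sum((x - mean) ** 2 for x in data) / (n - 1)

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

# z-scores: how many standard deviations each observation lies from the mean.
z_scores = [(x - mean) / std_dev for x in data]

# Coefficient of variation: standard deviation as a proportion of the mean.
cv = std_dev / mean

print(data_range, iqr, variance, std_dev, cv)
```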
34
Further Moments – Skewness Skewness measures the degree of asymmetry exhibited by the data If skewness equals zero, the histogram is symmetric about the mean Positive skewness vs negative skewness
35
[Figure: two distributions, A and B, showing the relative positions of the mode, median, and mean under skewness]
36
Further Moments – Kurtosis Kurtosis measures how peaked the histogram is Using the common convention of subtracting 3, the (excess) kurtosis of a normal distribution is 0 Kurtosis characterizes the relative peakedness or flatness of a distribution compared to the normal distribution
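A moment-based sketch of skewness and excess kurtosis (the third and fourth standardized moments, with 3 subtracted for kurtosis so a normal distribution scores roughly 0); the data are invented:

```python
def skewness(data):
    # Third standardized moment: > 0 for a long right tail, < 0 for a long left tail.
    n = len(data)
    mean = sum(data) / n
    sd = (sum((x - mean) ** 2 for x in data) / n) ** 0.5
    return sum(((x - mean) / sd) ** 3 for x in data) / n

def excess_kurtosis(data):
    # Fourth standardized moment minus 3, so a normal distribution gives roughly 0.
    n = len(data)
    mean = sum(data) / n
    sd = (sum((x - mean) ** 2 for x in data) / n) ** 0.5
    return sum(((x - mean) / sd) ** 4 for x in data) / n - 3.0

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 12]   # a long right tail -> positive skewness
print(skewness(data), excess_kurtosis(data))
```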
37
Source: http://espse.ed.psu.edu/Statistics/Chapters/Chapter3/Chap3.html
38
Functions of a Histogram The function of a histogram is to graphically summarize the distribution of a data set The histogram graphically shows the following: 1. Center (i.e., the location) of the data 2. Spread (i.e., the scale) of the data 3. Skewness of the data 4. Kurtosis of the data 5. Presence of outliers 6. Presence of multiple modes in the data.
39
Box Plots We can also use a box plot to graphically summarize a data set A box plot represents a graphical summary of what is sometimes called a “five-number summary” of the distribution –Minimum –Maximum –25th percentile –75th percentile –Median The box spans the Interquartile Range (IQR) (Rogerson, p. 8)
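The five numbers a box plot summarizes can be pulled from a sample with numpy's percentile function (assuming numpy is available); the data are invented:

```python
import numpy as np

data = np.array([2, 4, 4, 5, 6, 7, 8, 9, 11, 15])

five_number_summary = {
    "minimum": float(data.min()),
    "25th percentile": float(np.percentile(data, 25)),
    "median": float(np.median(data)),
    "75th percentile": float(np.percentile(data, 75)),
    "maximum": float(data.max()),
}
iqr = five_number_summary["75th percentile"] - five_number_summary["25th percentile"]

print(five_number_summary)
print("IQR:", iqr)
```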
40
Probability-Related Concepts An event – Any phenomenon you can observe that can have more than one outcome (e.g., flipping a coin) An outcome – Any unique condition that can be the result of an event (e.g., flipping a coin: heads or tails), a.k.a. a simple event or sample point Sample space – The set of all possible outcomes associated with an event –e.g., flip a coin – heads (H) and tails (T) –e.g., flip a coin twice – HH, HT, TH, TT
41
Probability-Related Concepts Associated with each possible outcome in a sample space is a probability Probability is a measure of the likelihood of each possible outcome Probability measures the degree of uncertainty Each of the probabilities is greater than or equal to zero, and less than or equal to one The sum of probabilities over the sample space is equal to one
42
How To Assign Probabilities to Experimental Outcomes? There are numerous ways to assign probabilities to the elements of sample spaces Classical method assigns probabilities based on the assumption of equally likely outcomes Relative frequency method assigns probabilities based on experimentation or historical data Subjective method assigns probabilities based on the assignor’s judgment or belief
43
Probability Rules Rules for combining multiple probabilities A useful aid is the Venn diagram - depicts multiple probabilities and their relations using a graphical depiction of sets The rectangle that forms the area of the Venn Diagram represents the sample (or probability) space, which we have defined above Figures that appear within the sample space are sets that represent events in the probability context, & their area is proportional to their probability (full sample space = 1)
44
Discrete & Continuous Variables Discrete variable – A variable that can take on only a finite number of values –# of malls within cities –# of vegetation types within geographic regions –population counts Continuous variable – A variable that can take on an infinite number of values (all real number values) –Elevation (e.g., [500.0, 1000.0]) –Temperature (e.g., [10.0, 20.0]) –Precipitation (e.g., [100.0, 500.0])
45
Probability Mass Functions A discrete random variable can be described by a probability mass function (pmf) A probability mass function is usually represented by a table, graph, or equation The probability of any outcome must satisfy: 0 <= p(X = x_i) <= 1, for i = 1, 2, 3, …, k The sum of all probabilities in the sample space must total one, i.e. Σ (i = 1 to k) p(X = x_i) = 1
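A tiny sketch of a probability mass function stored as a Python dictionary, with the two conditions above checked explicitly (the outcomes and probabilities are made up):

```python
# pmf for an invented discrete random variable X
pmf = {0: 0.125, 1: 0.375, 2: 0.25, 3: 0.25}

# Every probability must lie between 0 and 1 ...
assert all(0.0 <= p <= 1.0 for p in pmf.values())

# ... and the probabilities over the whole sample space must sum to one.
assert abs(sum(pmf.values()) - 1.0) < 1e-12

# P(X <= 2), for example, is just the sum of the relevant masses.
print(sum(p for x, p in pmf.items() if x <= 2))   # 0.75
```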
46
The probability of a continuous random variable X falling within an arbitrary interval [a, b] is given by the area under its density function f(x) over that interval: P(a <= X <= b) = the integral from a to b of f(x) dx If we know the density function, we can simply calculate this area using calculus
47
Discrete Probability Distributions Discrete probability distributions –The Uniform Distribution –The Binomial Distribution –The Poisson Distribution Each is appropriately applied in certain situations and to particular phenomena
48
The Binomial Distribution Provides information about the probability of the repetition of events when there are only two possible outcomes, –e.g. heads or tails, left or right, success or failure, rain or no rain … –Events with multiple outcomes may be simplified as events with two outcomes (e.g., forest or non-forest) Characterizing the probability of a proportion of the events having a certain outcome over a specified number of events
49
The Binomial Distribution A general formula for calculating the probability of x successes out of n trials, with a probability p of success on each trial: P(x) = C(n,x) * p^x * (1 – p)^(n – x) where C(n,x) is the number of possible combinations of x successes and (n – x) failures: C(n,x) = n! / (x! * (n – x)!)
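A direct implementation of the formula above (math.comb supplies C(n, x)); as an example, the probability of exactly 3 heads in 10 flips of a fair coin:

```python
import math

def binomial_pmf(x, n, p):
    # Probability of exactly x successes in n independent trials,
    # each with success probability p.
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

# Probability of exactly 3 heads in 10 flips of a fair coin (p = 0.5).
print(binomial_pmf(3, 10, 0.5))                            # 0.1171875

# The probabilities over all possible outcome counts sum to 1.
print(sum(binomial_pmf(x, 10, 0.5) for x in range(11)))    # 1.0 (up to rounding)
```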
50
Source: http://home.xnet.com/~fidler/triton/math/review/mat170/probty/p-dist/discrete/Binom/binom1.htm
51
The Poisson Distribution In the 1830s, S.D. Poisson described a distribution with these characteristics Describing the number of events that will occur within a certain area or duration (e.g. # of meteorite impacts per state, # of tornados per year, # of hurricanes in NC) Poisson distribution’s characteristics: 1. It is used to count the number of occurrences of an event within a given unit of time, area, volume, etc. (therefore a discrete distribution)
52
The Poisson Distribution 2. The probability that an event will occur within a given unit must be the same for all units (i.e. the underlying process governing the phenomenon must be invariant) 3. The number of events occurring per unit must be independent of the number of events occurring in other units (no interactions) 4. The mean or expected number of events per unit (λ) is found by past experience (observations)
53
The Poisson Distribution Poisson formulated his distribution as follows: P(x) = e^(–λ) * λ^x / x! To calculate a Poisson distribution, you must know λ where e = 2.71828… (base of the natural logarithm) λ = the mean or expected number of events per unit x = 0, 1, 2, … (the number of occurrences) x! = x * (x – 1) * (x – 2) * … * 2 * 1
54
The Poisson Distribution Procedure for finding Poisson probabilities and expected frequencies: (1) Set up a table with five columns as on the previous slide (2) Multiply the values of x by their observed frequencies (x * F_obs) (3) Sum the columns of F_obs (observed frequency) and x * F_obs (4) Compute λ = Σ (x * F_obs) / Σ F_obs (5) Compute P(x) values using the equation or a table (6) Compute the values of F_exp = P(x) * Σ F_obs
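A sketch of this procedure in Python: estimate λ from observed frequencies, then compute Poisson probabilities and expected frequencies. The observed counts are invented for illustration:

```python
import math

def poisson_pmf(x, lam):
    # P(x) = e^(-lambda) * lambda^x / x!
    return math.exp(-lam) * lam**x / math.factorial(x)

# Invented observed frequencies: f_obs[x] = number of units in which x events occurred.
f_obs = {0: 34, 1: 25, 2: 11, 3: 7, 4: 3}

total_units = sum(f_obs.values())                           # sum of F_obs
lam = sum(x * f for x, f in f_obs.items()) / total_units    # mean events per unit

# Expected frequencies: F_exp = P(x) * sum of F_obs
f_exp = {x: poisson_pmf(x, lam) * total_units for x in f_obs}

print(f"lambda = {lam:.3f}")
for x in f_obs:
    print(x, f_obs[x], round(f_exp[x], 1))
```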
55
Source: http://www.mpimet.mpg.de/~vonstorch.jinsong/stat_vls/s3.pdf
56
The Normal Distribution The probability density function of the normal distribution: f(x) = (1 / (σ * sqrt(2π))) * e^(–(x – μ)² / (2σ²)) You can see how the value of the distribution at x is a function of the mean and standard deviation
57
Standardization of Normal Distributions The standardization is achieved by converting the data into z-scores: z = (x – μ) / σ The z-score is the means by which we transform our normal distribution into a standard normal distribution (μ = 0 & σ = 1)
58
Finding the P(x) for Various Intervals 1. P(Z >= a): = (table value) The table gives the value of P(x) in the tail above a 2. P(Z <= a): = [1 – (table value)] The total area under the curve = 1, and we subtract the area of the tail above a 3. P(0 <= Z <= a): = [0.5 – (table value)] The total area under the curve = 1, thus the area above 0 (the mean) is equal to 0.5, and we subtract the area of the tail above a
59
Finding the P(x) for Various Intervals (for a negative value of a) 4. P(Z >= a): = [1 – (table value)] By symmetry, this is equivalent to P(Z <= a) when a is positive 5. P(Z <= a): = (table value) The table gives the value of P(x) in the tail below a, equivalent to P(Z >= a) when a is positive 6. P(a <= Z <= 0): = [0.5 – (table value)] This is equivalent to P(0 <= Z <= a) when a is positive
60
Finding the P(x) for Various Intervals 7. P(a <= Z <= b), with a < 0 and b > 0: = [0.5 – (table value for a)] + [0.5 – (table value for b)] = [1 – {(table value for a) + (table value for b)}] With this set of building blocks, you should be able to calculate the probability for any interval using a standard normal table
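In place of a printed z-table, the standard normal cumulative probability can be computed with the error function from Python's math module; a sketch of a few of the interval probabilities discussed above (the endpoints are arbitrary examples):

```python
import math

def phi(z):
    # P(Z <= z) for a standard normal variable, computed via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

a, b = -1.0, 1.96

print(phi(1.0) - 0.5)     # P(0 <= Z <= 1),              about 0.3413
print(1.0 - phi(1.96))    # P(Z >= 1.96),                about 0.0250
print(phi(b) - phi(a))    # P(a <= Z <= b), a < 0 < b,   about 0.8163
```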
61
Confidence Interval & Probability A confidence interval is expressed in terms of a range of values and a probability (e.g. my lectures are between 60 and 70 minutes long 95% of the time) For this example, the confidence level that I used is the 95% level, which is the most commonly used confidence level Other commonly selected confidence levels are 90% and 99%, and the choice of which confidence level to use when constructing an interval often depends on the application
62
The Central Limit Theorem Given a distribution with a mean μ and variance σ², the sampling distribution of the mean approaches a normal distribution with a mean (μ) and a variance σ²/n as n, the sample size, increases The amazing and counterintuitive thing about the central limit theorem is that no matter what the shape of the original (parent) distribution, the sampling distribution of the mean approaches a normal distribution
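A small simulation of the central limit theorem: repeatedly sample from a decidedly non-normal (exponential) parent distribution and look at the distribution of the sample means. The sample size and number of repetitions are arbitrary choices:

```python
import random
import statistics

random.seed(42)

n = 50               # sample size
num_samples = 5000   # number of repeated samples

# Exponential(1) parent distribution: strongly skewed, mean 1, variance 1.
sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

# The sampling distribution of the mean should center on the parent mean (1.0),
# have variance close to sigma^2 / n = 1 / 50 = 0.02, and look roughly normal.
print(statistics.mean(sample_means))
print(statistics.variance(sample_means))
```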
63
Confidence Intervals for the Mean Generally, a (1 – α)*100% confidence interval around the sample mean is: x̄ ± z_α * (σ / √n) where z_α is the value taken from the z-table that is associated with a fraction α of the weight in the tails (and therefore α/2 is the area in each tail) The quantity σ / √n is the standard error, and z_α * (σ / √n) is the margin of error
64
Constructing a Confidence Interval 1. Select our desired confidence level (1-α)*100% 2. Calculate α and α/2 3. Look up the corresponding z-score in a standard normal table 4. Multiply the z-score by the standard error to find the margin of error 5. Find the interval by adding and subtracting this product from the mean
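A sketch of the five steps, assuming the standard error is computed from the sample standard deviation and using the familiar table value 1.96 for the 95% level; the data are invented. (With a sample this small, the t-distribution discussed on the next slide would strictly be more appropriate; z is used here only to mirror the steps above.)

```python
import math

data = [12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2, 11.9, 12.4, 12.1]
n = len(data)

mean = sum(data) / n
s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))   # sample standard deviation

# Steps 1-3: for a 95% confidence level, alpha = 0.05, alpha/2 = 0.025,
#            and the corresponding z-score from a standard normal table is 1.96.
z = 1.96

# Step 4: margin of error = z-score * standard error.
margin = z * s / math.sqrt(n)

# Step 5: add and subtract the margin of error from the sample mean.
print(f"{mean:.3f} +/- {margin:.3f}  ->  ({mean - margin:.3f}, {mean + margin:.3f})")
```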
65
t-distribution The central limit theorem applies when the sample size is “large”; only then will the distribution of sample means possess a normal distribution When the sample size is not “large”, the frequency distribution of the sample means has what is known as the t-distribution The t-distribution is symmetric, like the normal distribution, but has a slightly different shape The t-distribution has relatively more scores in its tails than does the normal distribution. It is therefore leptokurtic
66
Assignment III Probability, Discrete, and Continuous Distributions Due: 03/07/2006 (Tuesday) http://www.unc.edu/courses/2006spring/geog/090/001/www/