Introductory Statistics and Data Analysis


1 Introductory Statistics and Data Analysis
MAT 135: Introductory Statistics and Data Analysis
Adjunct Instructor Kenneth R. Martin
Lecture 11, November 9, 2016

2 Confidential - Kenneth R. Martin
Agenda
Housekeeping
Exam #2
Readings: Chapter 1, 14, 10, 2, 3, 4 & 5

3 Housekeeping
Exam #2

4 Housekeeping
Read, Chapter 1.1 – 1.4
Read, Chapter 14.1 – 14.2
Read, Chapter 10.1
Read, Chapter 2
Read, Chapter 3
Read, Chapter 4
Read, Chapter 5

5 Continuous vs. Discrete vs. Attribute Data
Continuous: infinite # of possible measurements in a continuum (scale from 1 to 10)
Discrete: Count (1 2 3 4 5 6 7 8 9 10)
Discrete: Ordinal (“low”/“small”/“short”, “medium”/“mid”, “high”/“large”/“tall”)
Discrete: Nominal or Categorical, defines several groups, no order (Group A, Group B, Group C, Group D, Group E, Group F)
Attribute: Binary, defines TWO groups, no order (“bad”/“no-go”/“group #1” vs. “good”/“go”/“group #2”)

6 Discrete Probability Distribution
Examples:

7 Discrete Probability Distribution
Examples:

8 Discrete Probability Distribution
Examples:

9 Discrete Probability Distribution
Theorem 1: A probability of 1.000 means an event is certain to occur. A probability of 0 means the event is certain to NOT occur. Therefore: 0 ≤ P(E) ≤ 1

10 Discrete Probability Distribution
Theorem 5: For any discrete distribution, the total (sum) of the probabilities of all situations equals 1.000

11 Discrete Probability Distribution
Theoretical Mean:

12 Discrete Probability Distribution
Theoretical Mean - Example:

13 Discrete Probability Distribution
Variance:

14 Discrete Probability Distribution
Variance - Example:
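The formulas on the mean and variance slides did not survive in this transcript. As a sketch, assuming the standard textbook definitions (mean = sum of x · P(x), variance = sum of (x − mean)² · P(x)), they can be computed as below. The fair-die distribution is a hypothetical example, not the one used in the lecture.

```python
import math

def discrete_mean(dist):
    """Theoretical mean: sum of x * P(x) over all outcomes."""
    return sum(x * p for x, p in dist.items())

def discrete_variance(dist):
    """Variance: sum of (x - mean)^2 * P(x) over all outcomes."""
    mu = discrete_mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# Hypothetical distribution: a fair six-sided die, P(x) = 1/6 for each face.
die = {face: 1 / 6 for face in range(1, 7)}

# Theorem 5 check: the probabilities of all outcomes must sum to 1.
total_probability = sum(die.values())

mean = discrete_mean(die)
variance = discrete_variance(die)
sd = math.sqrt(variance)

print(f"sum of P(x) = {total_probability:.3f}")
print(f"mean = {mean:.4f}, variance = {variance:.4f}, SD = {sd:.4f}")
```

The same two functions work for any dictionary that maps outcomes to probabilities, as long as the probabilities sum to 1.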

15 Discrete Probability Distributions
Binomial Distribution: Used for discrete single-point (integer) probabilities. A Binomial probability distribution occurs when there is a fixed number of “trials”, or where there is a steady stream of items coming from a source. Used for data with two outcomes (pass / fail, head / tail, etc.); the events are independent, and the probability of each outcome does not change. Uses a Combination and simple multiplication.

16 Discrete Probability Distributions
Binomial Distribution:
P(d) = probability of d nonconforming or target units in a sample of size n
n = # units in the sample
d = # nonconforming or target units in the sample
p0 = proportion nonconforming / targets in the population (lot)
q0 = proportion conforming / not a target (1 - p0) in the population (lot)
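The formula image for this slide is missing from the transcript. A minimal sketch, assuming the standard binomial formula P(d) = C(n, d) · p0^d · q0^(n − d) and using the symbols defined above, could look like this. The sample size and proportion are hypothetical values chosen for illustration.

```python
import math

def binomial_pmf(d, n, p0):
    """P(d): probability of exactly d nonconforming units in a sample of n.

    Combination C(n, d) times simple multiplication of p0^d and q0^(n - d).
    """
    q0 = 1.0 - p0
    return math.comb(n, d) * p0 ** d * q0 ** (n - d)

# Hypothetical inputs: sample of n = 5 units from a lot that is 10% nonconforming.
n, p0 = 5, 0.10

p_one = binomial_pmf(1, n, p0)                             # exactly one nonconforming unit
total = sum(binomial_pmf(d, n, p0) for d in range(n + 1))  # all cases, should be 1

print(f"P(1) = {p_one:.5f}")
print(f"sum over d = 0..{n}: {total:.3f}")
```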

17 Discrete Probability Distributions
Binomial Distribution (example):

18 Discrete Probability Distributions
Binomial Distribution (example):

19 Discrete Probability Distributions
Binomial Distribution (example):

20 Discrete Probability Distributions
Binomial Distribution table:

21 Discrete Probability Distributions
Binomial Distribution – Mean / Var. & SD:
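The slide content for the binomial mean, variance, and SD is not in the transcript. Assuming the standard results (mean = n·p0, variance = n·p0·q0, SD = square root of the variance), the sketch below computes them and cross-checks against a direct sum over the distribution. The values n = 20 and p0 = 0.10 are hypothetical.

```python
import math

def binomial_summary(n, p0):
    """Return (mean, variance, SD) of a binomial distribution."""
    q0 = 1.0 - p0
    mean = n * p0
    variance = n * p0 * q0
    return mean, variance, math.sqrt(variance)

# Hypothetical inputs: 20 units sampled, 10% nonconforming in the lot.
n, p0 = 20, 0.10
mean, variance, sd = binomial_summary(n, p0)

# Cross-check with the general discrete-distribution definitions.
pmf = [math.comb(n, d) * p0 ** d * (1 - p0) ** (n - d) for d in range(n + 1)]
mean_from_pmf = sum(d * p for d, p in enumerate(pmf))
var_from_pmf = sum((d - mean_from_pmf) ** 2 * p for d, p in enumerate(pmf))

print(f"mean = {mean:.3f}, variance = {variance:.3f}, SD = {sd:.3f}")
print(f"from pmf: mean = {mean_from_pmf:.3f}, variance = {var_from_pmf:.3f}")
```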

22 Discrete Probability Distributions
Binomial Distribution (example):

23 Discrete Probability Distributions
Hypergeometric Distribution: Used for discrete single-point (integer) probabilities. A Hypergeometric probability distribution occurs when the population is finite, two outcomes are possible, and the random sample is taken without replacement (trials are not independent). Uses three Combinations and simple multiplication.

24 Discrete Probability Distributions
Hypergeometric Distribution:
P(d) = probability of d nonconforming / target units in a sample of size n
N = # units in the lot (population)
n = # units in the sample
D = # nonconforming / target units in the lot
d = # nonconforming / target units in the sample
N-D = # conforming / not a target in the lot
n-d = # conforming / not a target in the sample
C = Combinations

25 Discrete Probability Distributions
Hypergeometric Distribution (example): = (3 * 20) / 126 ≈ 0.476
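The three Combinations can be sketched as follows, assuming the standard hypergeometric formula P(d) = C(D, d) · C(N − D, n − d) / C(N, n). The inputs N = 9, n = 4, D = 3, d = 1 are an assumption: they are one set of values that reproduces the surviving arithmetic 3 * 20 / 126, but the original problem statement is not in the transcript.

```python
import math

def hypergeometric_pmf(d, N, n, D):
    """P(d): probability of d target units in a sample of n drawn without
    replacement from a lot of N units that contains D target units."""
    return math.comb(D, d) * math.comb(N - D, n - d) / math.comb(N, n)

# Assumed inputs (not confirmed by the source): lot of 9, 3 nonconforming, sample of 4.
N, n, D = 9, 4, 3

p_one = hypergeometric_pmf(1, N, n, D)  # C(3,1) * C(6,3) / C(9,4)
total = sum(hypergeometric_pmf(d, N, n, D) for d in range(min(n, D) + 1))

print(f"P(1) = {p_one:.4f}")
print(f"sum over all d: {total:.3f}")
```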

26 Discrete Probability Distributions
Hypergeometric Distribution (example):

27 Discrete Probability Distributions
Poisson Distribution: Used for discrete single-point (integer) probabilities. A Poisson probability distribution occurs when n is large and p0 is small. Used for applications involving observations per unit of time, or observations per quantity.

28 Discrete Probability Distributions
Poisson Distribution:
X = number of occurrences of an event in a sample.
λ = average count of events occurring per unit.
e = 2.71828 (base of the natural logarithm)
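The formula itself is missing from the transcript. A sketch under the assumption that the slide showed the standard Poisson formula, P(X) = λ^X · e^(−λ) / X!, is given below. The rate λ = 3 is a hypothetical value.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with average count lam per unit."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

# Hypothetical input: an average of 3 events per unit of time.
lam = 3.0

p_two = poisson_pmf(2, lam)                             # exactly 2 events
p_at_most_two = sum(poisson_pmf(x, lam) for x in range(3))  # 0, 1, or 2 events

print(f"P(X = 2)  = {p_two:.4f}")
print(f"P(X <= 2) = {p_at_most_two:.4f}")
```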

29 Discrete Probability Distributions
Poisson Distribution (example):

30 Discrete Probability Distributions
Poisson Distribution Table:

31 Discrete Probability Distributions
Poisson Distribution (alternate):
C = count of events occurring in a sample, i.e. count of non-conformities.
np0 = average count of events occurring in population.
e = constant = 2.71828

32 Discrete Probability Distributions
Poisson Distribution: The Poisson distribution formula can be used directly to find probability estimates, or Table C can be used. The table gives point values and cumulative values (in parentheses, accumulated from the top down).
Mean = np0
SD = (np0)^(1/2)
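One column of such a table can be rebuilt directly from the formula. The sketch below assumes the standard Poisson formula with np0 as the average count, and prints point values with cumulative values in parentheses. Table C itself is not reproduced in the transcript, and n = 100, p0 = 0.02 are hypothetical inputs.

```python
import math

def poisson_pmf(c, np0):
    """P(c): probability of exactly c events when the average count is np0."""
    return np0 ** c * math.exp(-np0) / math.factorial(c)

# Hypothetical inputs: large n, small p0, so np0 = 2.0.
n, p0 = 100, 0.02
np0 = n * p0

# Build rows of (c, point value, cumulative value), accumulating from the top down.
rows = []
cumulative = 0.0
for c in range(6):
    point = poisson_pmf(c, np0)
    cumulative += point
    rows.append((c, point, cumulative))

mean = np0
sd = math.sqrt(np0)

for c, point, cum in rows:
    print(f"c = {c}:  {point:.3f}  ({cum:.3f})")
print(f"mean = {mean:.3f}, SD = {sd:.3f}")
```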

33 Discrete Probability Distributions
Poisson Distribution (example):

34 Discrete Probability Distributions
Poisson Distribution Table:

35 Discrete Probability Distributions
Poisson Distribution Table:

36 Discrete Probability Distributions
Poisson Distribution Table:

37 Discrete Probability Distributions
Poisson Distribution Table:

38 Discrete Probability Distributions
Poisson Distribution Table:

39 Discrete Probability Distributions
Poisson Distribution (example):

40 Continuous vs. Discrete vs. Attribute Data
Continuous: infinite # of possible measurements in a continuum (scale from 1 to 10)
Discrete: Count (1 2 3 4 5 6 7 8 9 10)
Discrete: Ordinal (“low”/“small”/“short”, “medium”/“mid”, “high”/“large”/“tall”)
Discrete: Nominal or Categorical, defines several groups, no order (Group A, Group B, Group C, Group D, Group E, Group F)
Attribute: Binary, defines TWO groups, no order (“bad”/“no-go”/“group #1” vs. “good”/“go”/“group #2”)

41 Probability - Review
Theorem 5: For any discrete distribution, the total (sum) of the probabilities of all situations equals 1.000

42 Probability - Review
Definition, Theorem 5: Correspondingly, the total area under a continuous probability distribution (normal curve) is also equal to 1.000. The tails of the curve never touch the x-axis. Thus, area can be used to estimate probabilities.

43 Statistics
Histogram – by increasing data and thus bins, the fitted line becomes smoother and more accurate

44 Statistics
Histogram – by increasing data and thus bins, the fitted line becomes smoother and more accurate

45 Statistics
Histogram – by increasing data and thus bins, the fitted line becomes smoother and more accurate

46 Statistics
By increasing data, you approach the population, and get a smooth polygon.

47 Statistics
Area Under Curve
We can find the area under any curve by 2 methods:
1. Make a large quantity of really narrow bins, find each individual bin (rectangle) area under the curve, and add them all up.
2. Integrate under the curve, to find the area bounded by the curve and the X-axis. This method is simpler, and gives more accurate results.
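The two methods can be compared numerically. The sketch below is an illustration under assumptions: it uses the standard normal density as the curve, midpoint rectangles for method 1, and the closed-form result from the error function (`math.erf`) as the integrated value for method 2. The bin counts are arbitrary.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def rectangle_area(f, a, b, bins):
    """Method 1: add up narrow rectangles, each as tall as the curve at the bin midpoint."""
    width = (b - a) / bins
    return sum(f(a + (i + 0.5) * width) * width for i in range(bins))

# Area under the standard normal curve between -1 and +1.
coarse = rectangle_area(normal_pdf, -1.0, 1.0, bins=4)
fine = rectangle_area(normal_pdf, -1.0, 1.0, bins=1000)

# Method 2: the integral, which for the normal curve reduces to the error function.
exact = math.erf(1.0 / math.sqrt(2.0))

print(f"4 bins:    {coarse:.5f}")
print(f"1000 bins: {fine:.5f}")
print(f"integral:  {exact:.5f}")
```

With more and narrower bins the rectangle total moves toward the integrated value, which is the point the slide makes.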

48 Statistics
Equation of a Normal Distribution
Y = (1 / (σ√(2π))) · e^(−(X − μ)² / (2σ²))

49 Statistics
Area Under Curve
We may wish to find the area under the curve when, for example:
We want to find the number of students whose final semester grade falls within each standard letter-grade range, and we have a collection of student scores.
Or we want to find the number of people who arrive at a fast food restaurant chain after 11 am, and we have the associated data.
Etc.

50 Statistics
Continuous Probability Distribution (of a CRV)
A function of a Continuous Random Variable (CRV) that describes the likelihood that the variable falls within a given range of values, given by the integral of its probability density function over that range (i.e. the corresponding area under the f(x) curve).
We shall calculate CRV probabilities over ranges.

51 Statistics
Continuous Probability Distribution (of a CRV)
So we are seeking the area under some curve, y = f(x), bounded by the X-axis, between two values along the x-axis.

52 Statistics
Probability Density Function (cont. prob. dist.)
[Figure: curve f(X) = PDF plotted against X, with points a and b marked on the X-axis]

53 Statistics
Cumulative Density Function – Cross Section
∫ f(X) dx from -∞ to +∞ = 1.0 (sum under entire curve = 1.0)
[Figure: curve f(X) = PDF plotted against X]

54 Statistics
Probability Density Function (cont. prob. dist.)
P(a ≤ X ≤ b) = P(X ≤ b) - P(X ≤ a) = F(b) - F(a)
= entire area under the curve up to section (b) minus entire area under the curve up to section (a)
Sum under entire curve = 1.0
Curve typically read left to right

55 Statistics
Cumulative Density Function
P(X < t) = ∫ f(X) dx from -∞ to t = F(t)
[Figure: curve f(X) = PDF plotted against X, with the area F(t) to the left of t marked]

56 Statistics
Cumulative Density Function
F(t) + R(t) = 1.0
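The relationships on the preceding slides can be sketched for a normal curve. This example assumes that R(t) denotes the area to the right of t (the transcript does not define it), and it uses the error function from Python's standard library for F(t). The mean of 70 and SD of 10 are hypothetical exam-score parameters.

```python
import math

def normal_cdf(t, mu=0.0, sigma=1.0):
    """F(t): area under the normal curve from -infinity up to t."""
    return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))

# Hypothetical inputs: scores distributed normally with mean 70 and SD 10.
mu, sigma = 70.0, 10.0
a, b = 80.0, 90.0

# Area between a and b: everything up to b minus everything up to a.
p_between = normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

# F(t) and its complement R(t) (assumed here to be the area to the right of t).
F_t = normal_cdf(a, mu, sigma)
R_t = 1.0 - F_t

print(f"P({a:.0f} <= X <= {b:.0f}) = {p_between:.4f}")
print(f"F(t) = {F_t:.4f}, R(t) = {R_t:.4f}, sum = {F_t + R_t:.1f}")
```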

57 Statistics
Normal Curve
AKA Gaussian distribution of a CRV.
Mean, Median, and Mode have approx. the same value.
Associated with mean (μ) at center and dispersion (σ).
X ~ N(μ, σ) [when a random variable X is distributed normally]
Observations have equal likelihood on both sides of the mean.
*** When normally distributed, the Mean is used to describe Central Tendency.
The graph of the associated probability density function is called “Bell Shaped”.

58 Statistics
Normal Curve
Developed from a frequency histogram: with ↑ sample size and ↓ intervals (bin width), the associated curve becomes smooth.
Typical of much data and many distributions in reality.
The basis for most quality control techniques, formulas, and assumptions.
However, different Normal Distributions (pdf’s) can have varying means and SD’s.
The means and SD’s are independent (i.e. the mean does not affect the SD, and vice versa).

59 Statistics
Various Normal Curves (Different means, common SD)

60 Statistics
Various Normal Curves (Different SD’s, common means)

61 Statistics
Various Normal Curves

62 Statistics
Standardized Normal Value
There is an infinite number of combinations of means and SD’s for normal curves. Thus, two normal curves with different means or SD’s will differ in location or spread.
To find the area under any normal curve, we can use the two methods previously described (rectangles and integration).
Or, we can use the Standard Normal Approach, using tables to find the area under the curve, and thus probabilities.
Standard Normal Distribution: N(0,1)

63 Statistics
Standardized Normal Value
The Standard Normal Distribution has a Mean = 0 and an SD = 1.
The Standard Normal Transformation (z-Transformation) converts any normal distribution, with any mean and any SD, to a Standard Normal Distribution with mean 0 and SD 1.
The Standard Normal Distribution is expressed in “z-score” units along the associated x-axis.
A z-score specifies the number of SD units a value is above or below the mean (e.g. z = +1 indicates a value 1 SD above the mean).
A formula is used to convert a value, given the mean and SD, to a z-score.
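The conversion formula is not shown in the transcript. Assuming the standard z-transformation, z = (x − μ) / σ, a small sketch follows. The score of 85 with mean 70 and SD 10 is hypothetical, and the area lookup uses `math.erf` in place of a printed standardized normal table.

```python
import math

def z_score(x, mu, sigma):
    """Number of SD units that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

def standard_normal_area_below(z):
    """Area under the N(0,1) curve to the left of z (what a z-table reports)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical inputs: a value of 85 from a normal distribution with mean 70, SD 10.
x, mu, sigma = 85.0, 70.0, 10.0

z = z_score(x, mu, sigma)
area_below = standard_normal_area_below(z)

print(f"z = {z:+.2f}")
print(f"P(X < {x:.0f}) = {area_below:.4f}")
```

Because every normal curve maps onto N(0,1) this way, a single table (or a single function) covers all combinations of mean and SD.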

64 Statistics
Normal Curve - Distribution of Data

65 Statistics
Standard Normal Curve - Distribution of Data (z-scores)

66 Statistics
Normal Curve - Distribution of Data

67 Statistics
Standard Normal Distribution (z-scores)

68 Statistics
Standardized Normal Value

69 Statistics
Normal distribution example

70 Statistics
Standard Normal Distribution example

71 Statistics
Standardized Normal Table

72 Statistics
Standardized Normal Table

