Published by Ray Gasson. Modified over 9 years ago.
1
CHAPTER 2
2.1 - Basic Definitions and Properties
    Population Characteristics = “Parameters”
    Sample Characteristics = “Statistics”
    Random Variables (Numerical vs. Categorical)
2.2, 2.3 - Exploratory Data Analysis
    Graphical Displays
    Descriptive Statistics
        Measures of Center (mode, median, mean)
        Measures of Spread (range, variance, standard deviation)
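The measures of center and spread listed in the outline can be sketched with Python's standard `statistics` module. The pulse-rate values below are made up purely for illustration and do not come from the slides.

```python
import statistics

# Hypothetical sample of pulse rates (beats per minute), invented for illustration.
pulse = [62, 70, 70, 74, 81, 88]

# Measures of center
mode = statistics.mode(pulse)      # most frequent value
median = statistics.median(pulse)  # middle of the sorted data (average of the two middle values here)
mean = statistics.mean(pulse)      # arithmetic average

# Measures of spread
value_range = max(pulse) - min(pulse)  # largest value minus smallest value
variance = statistics.variance(pulse)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(pulse)      # square root of the sample variance

print(mode, median, mean, value_range, variance, std_dev)
```

Note that `statistics.variance` and `statistics.stdev` use the sample (n - 1) convention; `statistics.pvariance` and `statistics.pstdev` divide by n instead.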
2
POPULATION – composed of “units” (people, rocks, toasters, ...)
“Random Variable” X = any numerical value that can be assigned to each unit of a population.
“Random” refers to the notion that this value is unknown until actually observed (usually as part of an outcome of an experiment to test a specific hypothesis). [Contrast this with the idea of a “nonrandom” variable with no empirical error, e.g., X = # cards in a deck = 52.]
There are two general types: Quantitative and Qualitative.
Important Fact: To make certain calculations simpler, we assume that populations are “arbitrarily large” (or indeed, infinite).
What do we want to know about this population?
Quantitative [measurement]: length, mass, temperature, pulse rate, # puppies, shoe size (e.g., 10, 10½, 11)
3
Quantitative [measurement]: length, mass, temperature, pulse rate, # puppies, shoe size
    CONTINUOUS (can take their values at any point in a continuous interval), e.g., length, mass, temperature
    DISCRETE (only take their values in disconnected jumps), e.g., # puppies, shoe size
4
Qualitative [categorical]
    ORDINAL, RANKED: video game levels (1, 2, 3, ...); income level (1 = low, 2 = mid, 3 = high)
    NOMINAL: zip code; ID #; color (Red, Green, Blue)
IMPORTANT CASE: Binary (or Dichotomous)
    Gender (Male / Female)
    “Pregnant?” (Yes / No)
    Coin toss (Heads / Tails)
    Treatment (Drug / Placebo)
    Coded as X = 1 (“Success”) or X = 0 (“Failure”)
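One reason the 1/0 coding of a binary variable is convenient: the sum of the coded values counts the “successes”, and their mean is the proportion of successes. A minimal sketch, using made-up coin-toss outcomes and treating “Heads” as the success (an arbitrary choice for illustration):

```python
# Hypothetical coin-toss outcomes, invented for illustration.
tosses = ["Heads", "Tails", "Heads", "Heads", "Tails"]

# Code "Heads" as 1 ("Success") and anything else as 0 ("Failure").
x = [1 if t == "Heads" else 0 for t in tosses]

# With 0/1 coding, the sum counts successes and the mean is their proportion.
successes = sum(x)
proportion = successes / len(x)

print(x, successes, proportion)
```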
5
Another way to define X is with “indicator variables”: one 0/1 variable per category. For a three-category variable such as color (Red, Green, Blue), exactly one indicator equals 1 for each unit, so I_1 + I_2 + I_3 = 1.
Example: Excel file of patient blood types. Each patient row sums to 1, i.e., O + A + B + AB = 1.
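Indicator coding can be sketched in a few lines of Python. The helper name `indicators` and the list of colors are hypothetical, chosen only to mirror the Red / Green / Blue example; the slides themselves use an Excel file that is not reproduced here.

```python
CATEGORIES = ["Red", "Green", "Blue"]

def indicators(value, categories=CATEGORIES):
    """Return one 0/1 indicator per category; exactly one of them is 1."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]

# Hypothetical observed colors, invented for illustration.
colors = ["Green", "Blue", "Red", "Green"]
rows = [indicators(c) for c in colors]

for color, row in zip(colors, rows):
    # Each row sums to 1 because a unit belongs to exactly one category.
    print(color, row, sum(row))
```

Summing each indicator column over all rows gives the count of units in that category, which is why this coding is handy for tabulating categorical data.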
6
[Figure: “Population Distribution of X” (somewhat idealized)]
Population mean: μ (“mu”)
Population “standard deviation”: σ (“sigma”)
μ and σ are examples of parameters: nonrandom “population characteristics” whose exact values cannot be directly measured, but can (hopefully) be estimated from known “sample characteristics”, i.e., statistics.
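As a point of reference, for a finite population of N units with values X_1, ..., X_N (an assumption made here for concreteness; the slides idealize the population as arbitrarily large), the two parameters can be written as:

```latex
\mu = \frac{1}{N} \sum_{i=1}^{N} X_i ,
\qquad
\sigma = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( X_i - \mu \right)^2 }
```

Both are fixed numbers determined by the whole population, which is why they are called nonrandom.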
7
Random variable X (Example: X = Age)
[Figure: “Population Distribution of X” (somewhat idealized)]
How do we infer information about the population variable X?
SAMPLE of size n: x_1, x_2, x_3, x_4, x_5, x_6, ..., x_n
    x_1 = value of X for 1st individual
    x_2 = value of X for 2nd individual
    ...etc....
8
Random variable X (Example: X = Age)
SAMPLE of size n: x_1, x_2, x_3, x_4, x_5, x_6, ..., x_n
Sample mean (an example of a statistic):
$\bar{x} = \dfrac{x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + \cdots + x_n}{n}$
“Parameter Estimation” (“Statistical Inference”): the sample mean $\bar{x}$ serves as an estimate of µ.
There are many potential random samples of a fixed size n, each with its own estimate of µ. It will eventually become important to understand the structure of their variability.
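The idea that each random sample of size n yields its own estimate of µ can be illustrated with a small simulation. Everything below is an assumption made for illustration: the “population” is a large list of simulated ages (standing in for the idealized infinite population), and the sample size of 50 and the 1000 repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population of ages; a large finite list stands in for
# the idealized "arbitrarily large" population.
population = [random.randint(18, 90) for _ in range(100_000)]
mu = statistics.mean(population)  # the parameter (known here only because we simulated it)

def sample_mean(pop, n):
    """Draw a simple random sample of size n (without replacement) and return its mean."""
    return statistics.mean(random.sample(pop, n))

n = 50
estimates = [sample_mean(population, n) for _ in range(1000)]

# Each estimate differs from mu; their spread describes the sampling variability.
spread = statistics.stdev(estimates)
print(mu, min(estimates), max(estimates), spread)
```

In practice µ is unknown and only one sample is available; the simulation simply makes visible how much the sample mean would vary from sample to sample.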
9
Statistics are numerical values that are culled from a random sample of measurements taken from a specific population, in an effort to “summarize” its overall distribution, and estimate certain parameters (i.e., numerical characteristics) of that population.
Statistics – as a discipline – consists of a collection of formal testing procedures, designed to infer a conclusion regarding a specific hypothesis about the population, based on the sample data.
Statistics is sometimes referred to as the “search for sources of random variation” in a system. How much of a signal is genuinely significant information to be detected, and how much is random “noise”?
The “classical scientific method” provides a general framework for conducting formal statistical analysis.