Basics of Statistics
Statistics the science of collecting, analyzing, and drawing conclusions from data
Descriptive statistics the methods of organizing & summarizing data
Inferential statistics involves making generalizations from a sample to a population. This will be covered in the 2nd semester.
Population The entire collection of individuals or objects about which information is desired
Sample A subset of the population, selected for study in some prescribed manner
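The relationship between a population and a sample can be sketched in a few lines of Python. This is only an illustration: the heights are made-up values, and simple random sampling is assumed as the "prescribed manner" of selection (the slides do not name a method).

```python
import random
import statistics

# Hypothetical population: heights (in cm) of 1,000 individuals (made-up data).
rng = random.Random(0)  # fixed seed so the sketch is reproducible
population = [rng.gauss(170, 10) for _ in range(1000)]

# Sample: a subset of the population, selected here by simple random sampling.
sample = rng.sample(population, 50)

# Descriptive statistics summarize the data we actually have.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

# Inferential statistics would use sample_mean to say something about
# population_mean when the whole population cannot be measured.
print(f"population mean: {population_mean:.1f}, sample mean: {sample_mean:.1f}")
```

The two means will usually be close but not identical, which is why generalizing from a sample needs the inferential methods mentioned above.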
Data observations on a single variable or simultaneously on two or more variables
Context (the W’s) Tells us WHO was measured; WHAT was measured; WHEN the study was performed; WHERE the data were collected; WHY the study was performed; HOW the data were collected
Variable any characteristic whose value may change from one individual to another
Types of Variables Variables can be classified as categorical (aka, qualitative) or quantitative (aka, numerical).
Categorical variables (or qualitative) take on values that are names or labels; they identify basic differentiating characteristics of the population
Is it or isn’t it? Determine whether a variable is categorical or not. Favorite Color: Categorical; Height: Not; Zip Code: Categorical; Number of tweets: Not; Shoe Size: Not; Breed of dog: Categorical
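Zip Code is marked categorical even though it is written with digits, because the digits act as labels rather than amounts. A minimal sketch of how such a variable is usually handled: store the values as text and count how often each label occurs. The zip codes below are made-up sample values.

```python
from collections import Counter

# Made-up zip codes, stored as strings because they are labels, not quantities.
zip_codes = ["75001", "75001", "10027", "60614", "10027", "75001"]

# For a categorical variable, the natural summary is a frequency count per label.
counts = Counter(zip_codes)
most_common_zip, frequency = counts.most_common(1)[0]

print(counts)
print(most_common_zip, frequency)
```

Averaging these zip codes would produce a number with no meaning, which is a quick practical check that a variable is categorical.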
Card Sort Do the Card Sort Activity with your partner. Once completed, check your answers with your group. Prepare to discuss results.
Card Sort Answers Categorical Variables: Brand of toothpaste College Major Telephone Company Checking Account Location School Attended Customer Service Survey School Mascot
Quantitative variables (or numerical) observations or measurements; take on numerical values; have units (dollars, hours, etc.); two types: discrete & continuous
Discrete (numerical) listable set of values; usually counts of items
Continuous (numerical) data can take on any value in the domain of the variable; usually measurements of something
Classification by the number of variables Univariate - data that describes a single characteristic of the population; Bivariate - data that describes two characteristics of the population; Multivariate - data that describes more than two characteristics (beyond the scope of this course)
Identify the following as categorical or numerical: gender; age; right or left handed; how you got to school; number of movies; number of girls in class; fastest speed driven
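As a rough illustration of the univariate and bivariate cases, the same set of records can be viewed one characteristic at a time or two at a time. The records and field names below are hypothetical.

```python
# Hypothetical records: each dict describes one individual.
students = [
    {"height_cm": 162.5, "shoe_size": 7.0},
    {"height_cm": 175.0, "shoe_size": 9.5},
    {"height_cm": 168.2, "shoe_size": 8.0},
]

# Univariate: a single characteristic per individual.
heights = [s["height_cm"] for s in students]

# Bivariate: two characteristics per individual, kept together as pairs.
height_and_shoe = [(s["height_cm"], s["shoe_size"]) for s in students]

print(heights)
print(height_and_shoe)
```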
Types of Distributions 4 common types
Symmetrical refers to data in which both sides are (more or less) the same when the graph is folded vertically down the middle. Bell-shaped is a special type: it has a center mound with two sloping tails.
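The "fold down the middle" idea can be sketched as comparing a list of class frequencies with its mirror image. The function name and the tolerance of 1 are assumptions made for illustration; "more or less the same" has no fixed cutoff.

```python
def is_roughly_symmetric(frequencies, tolerance=1):
    """Return True if each class frequency is within `tolerance`
    of the frequency of its mirror-image class."""
    mirrored = list(reversed(frequencies))
    return all(abs(left - right) <= tolerance
               for left, right in zip(frequencies, mirrored))

# Class frequencies from a hypothetical histogram.
print(is_roughly_symmetric([1, 4, 9, 4, 1]))
print(is_roughly_symmetric([9, 4, 2, 1, 1]))
```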
Uniform refers to data in which every class has equal or approximately equal frequency
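One way to make "approximately equal frequency" concrete is to check that every class frequency stays close to the average frequency. This is a sketch only: the function name and the 20% tolerance are assumptions, not a standard rule.

```python
def is_roughly_uniform(frequencies, tolerance=0.2):
    """Return True if every class frequency is within `tolerance`
    (as a fraction) of the mean class frequency."""
    mean = sum(frequencies) / len(frequencies)
    return all(abs(f - mean) <= tolerance * mean for f in frequencies)

# Class frequencies from hypothetical histograms.
print(is_roughly_uniform([10, 11, 9, 10]))
print(is_roughly_uniform([2, 10, 3, 1]))
```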
Skewed (left or right) refers to data in which one side (tail) is longer than the other side. The direction of skewness is on the side of the longer tail.
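The direction of skewness can also be estimated numerically. The sketch below uses the moment coefficient of skewness (third central moment divided by the second central moment raised to the power 1.5), which is positive when the longer tail is on the right and negative when it is on the left. The slides do not prescribe this measure; it is swapped in here for illustration, and the tolerance of 0.1 is an arbitrary assumption.

```python
import statistics

def skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5."""
    n = len(data)
    mean = statistics.fmean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    if m2 == 0:
        return 0.0  # all values identical: no spread, so no skew
    return m3 / m2 ** 1.5

def skew_direction(data, tolerance=0.1):
    """Label the data by the side of its longer tail."""
    g1 = skewness(data)
    if g1 > tolerance:
        return "right"
    if g1 < -tolerance:
        return "left"
    return "roughly symmetric"

# Made-up data with one large value pulling the tail to the right.
print(skew_direction([1, 2, 2, 3, 3, 3, 4, 4, 20]))
```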
Bimodal (multi-modal) refers to data in which two (or more) classes have the largest frequency & are separated by at least one other class
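The definition above can be applied almost literally to a list of class frequencies: find the classes that share the largest frequency and check whether any two of them are separated by at least one other class. The function name is hypothetical, and the sketch requires exact ties; real data usually calls for judging peaks that are only approximately equal.

```python
def is_bimodal(frequencies):
    """Return True if two or more classes share the largest frequency
    and at least one other class lies between two of them."""
    if not frequencies:
        return False
    top = max(frequencies)
    peak_positions = [i for i, f in enumerate(frequencies) if f == top]
    # Neighbouring classes differ in position by 1, so a gap of 2 or more
    # means at least one other class separates the two peaks.
    return any(b - a >= 2 for a, b in zip(peak_positions, peak_positions[1:]))

# Class frequencies from hypothetical histograms.
print(is_bimodal([2, 5, 3, 5, 1]))
print(is_bimodal([1, 5, 5, 1]))
```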