1.1 Displaying Data Visually Learning goal: Classify data by type; create appropriate graphs
Why do we collect data? We learn by observing Collecting data is a systematic method of making observations Allows others to repeat our observations
Types of Data
1) Quantitative – can be represented by a number
   Discrete data: a fraction/decimal is not possible, e.g., age, number of siblings
   Continuous data: fractions/decimals are possible, e.g., height, weight, academic average
2) Qualitative – cannot be measured numerically, e.g., eye colour, surname, favourite band
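The three categories above can be sketched as a small classifier. This is only a rough heuristic, not a rule from the course: the function name `classify` and the sample values are hypothetical, and whole-number values are assumed to signal discrete data even though a continuous measurement can happen to be recorded as a whole number.

```python
from numbers import Real


def classify(values):
    """Classify a list of observations using the three categories on the slide."""
    # Anything that is not a number (booleans excluded too) is treated as qualitative.
    if not all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
        return "qualitative"
    # Heuristic: if every value is a whole number, assume the data is discrete.
    if all(float(v).is_integer() for v in values):
        return "quantitative (discrete)"
    return "quantitative (continuous)"


siblings = [0, 2, 1, 3, 1]               # hypothetical counts
heights_cm = [162.5, 170.2, 158.0]       # hypothetical measurements
eye_colours = ["brown", "blue", "green"]  # hypothetical categories

print(classify(siblings))
print(classify(heights_cm))
print(classify(eye_colours))
```

The heuristic looks only at the recorded values, so the final call on data type still depends on what the variable measures.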
Who do we collect data from?
Population – the entire group from which we can collect data / draw conclusions
   Data does NOT have to be collected from every member
Census – data collected from every member of the population
   Data is representative of the population
   Can be time-consuming and/or expensive
Sample – data collected from a subset of the population
   A well-chosen sample will be representative of the population
   Sampling methods are covered in Ch 2
Organizing Data
A frequency table is often used to display data, listing the variable and the frequency. What type of data does this table contain?
Day        Number of absences
Monday     5
Tuesday    4
Wednesday  2
Thursday   0
Friday     8
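As a minimal sketch, the absences table can be stored as a dictionary and extended with relative frequencies. The helper name `relative_frequencies` is an assumption made for illustration; the counts are the ones in the table.

```python
# Frequency table from the slide: day of the week -> number of absences.
absences = {"Monday": 5, "Tuesday": 4, "Wednesday": 2, "Thursday": 0, "Friday": 8}


def relative_frequencies(table):
    """Return each category's share of the total as a fraction between 0 and 1."""
    total = sum(table.values())
    return {category: count / total for category, count in table.items()}


total = sum(absences.values())
shares = relative_frequencies(absences)
busiest_day = max(absences, key=absences.get)  # category with the highest frequency

for day, count in absences.items():
    print(f"{day:<10} {count:>2}  {shares[day]:6.1%}")
print("Total absences:", total)
print("Most absences on:", busiest_day)
```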
Organizing Data (cont’d) Another useful organizer is a stem and leaf plot. This table represents the following data: Stem (first 2 digits) Leaf (last digit)
Organizing Data (cont’d) What type of data is this? The class interval is the size of the grouping , , , etc. No decimals required for discrete data Stem can have as many numbers as needed A leaf must be recorded each time the number occurs Stem | Leaf
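One way to build a stem and leaf plot is sketched below, assuming non-negative whole numbers with the last digit as the leaf and every digit before it as the stem. The function name `stem_and_leaf` and the data values are hypothetical and are not the data from the slide.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group values by stem (all digits but the last) and list each leaf (last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        # A leaf is appended every time the number occurs, so repeats are kept.
        plot[stem].append(leaf)
    return dict(plot)


data = [101, 103, 103, 112, 118, 120, 125, 125, 129]  # hypothetical values

for stem, leaves in stem_and_leaf(data).items():
    print(f"{stem:>3} | {' '.join(str(leaf) for leaf in leaves)}")
```

This sketch skips stems that have no leaves; a hand-drawn plot would normally still list them as empty rows.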
Displaying Data – Bar Graphs Typically used for qualitative/discrete data Shows how certain categories compare Why are the bars separated? Would it be incorrect if you didn’t separate them? Number of police officers in Crimeville, 1993 to 2001
Bar graphs (cont’d) Double bar graph Compares 2 sets of data Internet use at Redwood Secondary School, by sex, 1995 to 2002 Stacked bar graph Compares 2 variables
Displaying Data – Histograms Typically used for continuous data The bars are attached because the x-axis represents intervals
Displaying Data – Pie / Circle Graphs A circle divided up to represent the data Shows each category as a % of the whole
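The arithmetic behind a circle graph can be sketched as follows: each category's share of the total gives its percentage, and the same share of 360 degrees gives its slice angle. The function name `pie_slices` and the eye-colour counts are hypothetical.

```python
def pie_slices(counts):
    """Map each category to (percent of whole, slice angle in degrees)."""
    total = sum(counts.values())
    return {
        category: (100 * n / total, 360 * n / total)
        for category, n in counts.items()
    }


eye_colour_counts = {"brown": 12, "blue": 6, "green": 4, "hazel": 2}  # hypothetical

for category, (percent, angle) in pie_slices(eye_colour_counts).items():
    print(f"{category:<6} {percent:5.1f}%  {angle:6.1f} degrees")
```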
Scatter Plot Shows the relationship (correlation) between two numeric variables May show a positive, negative or no correlation Can be modeled by a line or curve of best fit (regression)
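A rough sketch of how the strength of a linear correlation and a line of best fit might be computed by hand. It assumes Pearson's correlation coefficient and ordinary least squares, which are common choices but are not named on the slide. The function names and the data points are hypothetical.

```python
from math import sqrt


def _sums(xs, ys):
    """Return the means and the sums of squared/cross deviations for two lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return mx, my, sxy, sxx, syy


def pearson_r(xs, ys):
    """Correlation coefficient: near +1 positive, near -1 negative, near 0 none."""
    _, _, sxy, sxx, syy = _sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def best_fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept."""
    mx, my, sxy, sxx, _ = _sums(xs, ys)
    slope = sxy / sxx
    return slope, my - slope * mx


xs = [1, 2, 3, 4, 5]  # hypothetical x-values
ys = [2, 4, 5, 4, 5]  # hypothetical y-values

slope, intercept = best_fit_line(xs, ys)
print(f"r = {pearson_r(xs, ys):.3f}")
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

Both functions assume the values in each list are not all identical; otherwise the denominators are zero.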
Line Graph Shows long-term trends over time e.g. stock price, price of goods, currency
Box and Whisker Plot Shows the spread of data The quartiles (Q1, median, Q3) divide the data into 4 groups Each group holds 25% of the data Based on medians
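The five values a box and whisker plot is drawn from (minimum, Q1, median, Q3, maximum) can be sketched using medians only. This assumes the "median of each half" convention, where the overall median is left out of both halves when the number of values is odd; other conventions give slightly different quartiles. The function name and the data are hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using medians of the two halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For an odd count, skip the middle value so it belongs to neither half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]


scores = [2, 4, 4, 5, 7, 8, 9, 10, 12]  # hypothetical data, at least 2 values needed

print(five_number_summary(scores))
```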
Pictograph Use images (size or quantity) to represent frequency
Timeline Shows a series of events over time
Heat Map Use colours to represent different data ranges Does not have to be a geographical map e.g., Gas Price Temperature
Practice questions p. 11 #2, 3ab, 4, 7, 8
An example… These are prices for Internet service packages. Find the mean, median and mode. State the type of data. Create a frequency table, a stem and leaf plot, and a graph for the following data.
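The three measures asked for can be checked with Python's standard `statistics` module. The prices below are hypothetical stand-ins, not the data from the slide. `multimode` (Python 3.8 and later) is used so that ties for the mode are all reported.

```python
from statistics import mean, median, multimode

# Hypothetical monthly prices in dollars (not the values from the slide).
prices = [19.95, 24.95, 24.95, 29.95, 34.95, 39.95, 49.95]

price_mean = mean(prices)         # sum of the values divided by how many there are
price_median = median(prices)     # middle value once the data is sorted
price_modes = multimode(prices)   # most frequent value(s), returned as a list

print(f"mean   = {price_mean:.2f}")
print(f"median = {price_median:.2f}")
print(f"mode   = {price_modes}")
```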
1.2 Conclusions and Issues in Two Variable Data Learning goal: Draw conclusions from two-variable graphs Practice questions: p. 20–24 #1, 4, 11, 14 “Having the data is not enough. [You] have to show it in ways people both enjoy and understand.” - Hans Rosling
Types of statistical relationships
Correlation
   Two variables appear to be related
   A change in one variable is associated with a change in the other
   e.g., salary increases as age increases
Causation
   A change in one variable is PROVEN to cause a change in the other
   Requires an in-depth study
   e.g., incidence of cancer among smokers
What conclusions are possible? To draw a conclusion… Data must address the question Data must represent the population Census, or representative sample (10%)
Case Study – Opinions of school students were surveyed The variables were: Gender Attitude towards school Performance at school
Example 1) What story does this graph tell?
Example 1 – cont’d Majority of females said they like school “quite a bit” or “very much” ~half the males said they like school “a bit” or less ~3 times more males than females said they hate school Conclusion: the females in this study like school more than males do
Example 2a – Is there a correlation between attitude and performance? Larger version on next slide…
Example 2a – cont’d Most students answered “Very well” Only one student said “Poorly” Of the four students who answered “I hate school,” one said he was doing well. It appears that performance correlates with attitude Is a sample of 27 students enough to make a valid inference? Is the sample representative of the population?
Example 2b – Examine all 1046 students
Example 2b – cont’d From the data, the following conclusions can be made: All students who responded “Very poorly” also responded “I hate school” or “I don’t like school very much.” A larger proportion of students who responded “Poorly” also responded “I hate school” or “I don’t like school very much.” It appears that there is a relationship between attitude and performance. Is this correlation or causation?
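The comparison above amounts to asking, for each performance level, what share of students gave a negative attitude response. A minimal sketch of that calculation follows. The responses are a small hypothetical list, not the survey data, and the function name is an assumption.

```python
from collections import Counter

# The two responses the slide treats as a negative attitude towards school.
NEGATIVE = {"I hate school", "I don't like school very much"}


def negative_share_by_performance(responses):
    """For each performance level, return the fraction of negative-attitude answers."""
    totals = Counter()
    negatives = Counter()
    for attitude, performance in responses:
        totals[performance] += 1
        if attitude in NEGATIVE:
            negatives[performance] += 1
    return {level: negatives[level] / totals[level] for level in totals}


# Hypothetical (attitude, performance) pairs for illustration only.
responses = [
    ("I hate school", "Very poorly"),
    ("I don't like school very much", "Very poorly"),
    ("I hate school", "Poorly"),
    ("I like school a bit", "Poorly"),
    ("I like school quite a bit", "Well"),
    ("I like school a bit", "Well"),
    ("I don't like school very much", "Well"),
    ("I like school very much", "Very well"),
]

for level, share in negative_share_by_performance(responses).items():
    print(f"{level:<12} {share:.0%} negative")
```

Shares that fall as performance improves would suggest a correlation between the two variables; they would not show on their own that one causes the other.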
Drawing Conclusions Do females seem more likely to be interested in student government? Is this a correlation? Does being female CAUSE more interest in student government?