Statistics YEAR TEN MATHS FOR FURTHER SEMESTER TWO.


1 Statistics YEAR TEN MATHS FOR FURTHER SEMESTER TWO

2 Categorical vs Numerical Data Data can be divided into two major groups: Numerical Data and Categorical Data.

3 Categorical vs Numerical Data Numerical Data is in the form of numbers and can be classed as either:
Discrete – Numbers counted in exact values, usually whole numbers. eg. Goals scored in a footy match, Number of children in a family
Continuous – Numbers measured on a continuous decimal scale. eg. Mass of an object, Time, Length, Temperature

4 Categorical vs Numerical Data Categorical Data can be classed in two separate categories:
Nominal – Requires sub-groups (names) to complete the description. eg. Hair Colour (Brown, Blonde, Black etc.)
Ordinal – Requires sub-groups in terms of ranking to order the description. eg. Level of Achievement (Excellent, Very Good, Good, Poor), Size of Pizza (Small, Medium, Large, Family)

5 What type of data is…?
The number of goals kicked per match of footy. Numerical – Discrete
The types of vehicles driving along a road. Categorical – Nominal
The sizes of pizza available at a pizza shop. Categorical – Ordinal
The varying temperature outside throughout the day. Numerical – Continuous

6 Now Do Introduction to Statistics Worksheet Question 1

7 Working with Categorical Data Once data has been collected, it is important to be able to display it in a meaningful way, using a range of different charts, including:
Frequency Tables
Graphs – Column / Bar Chart (gap at the start of the plot and gaps between each bar)
Dot Plots

8 Working with Numerical Data Frequency Tables and Histograms, for ungrouped data or grouped data. In a histogram there is a gap at the start and the columns are joined.

9 Working with Numerical Data Stem and Leaf Plots Dot Plots

10 Working with Data eg1. The chart below shows the marital status of 40 respondents to a survey.
a) What type of data is this? Categorical – Nominal
b) What type of chart is this? Why? Bar chart – gaps between columns
c) What is the most common marital status and how many respondents are in this category? Married – 15 respondents
d) How many respondents are marked ‘never married’? 12

11 Working with Data eg. 60 packets of jellybeans were opened and the number of jellybeans within them counted.
a) What type of data is this? Numerical – Discrete
b) How many packets had 51 jellybeans? 9
c) Would we display this data on a histogram or a bar chart? Why? Histogram – Numerical Data
d) Plot the data on the chart you chose in part c

12 Working with Data eg. 60 packets of jellybeans were opened and the number of jellybeans within them counted. d) Plot the data on the chart you chose in part c

13 Working with Data Class Hair Colour Survey Gather Data of the students in the classroom and use it to: 1. Summarise data using a frequency table 2. Represent data using a graph

14 Working with Data Class Hair Colour Survey
1. Summarise data using a frequency distribution table
Hair Colour    Tally    Total
Brown
Blonde
Black
Red
Other

15 Working with Data Class Hair Colour Survey 2. Represent data using a bar chart. Remember – In a bar chart the bars don’t touch. Leave gaps! [Chart axes: Hair Colour, Frequency]

16 Now Do Introduction to Statistics Worksheet Question 2

17 Frequency, Relative Frequency and Percentage Frequency We can investigate how often a particular event occurs using the following:
Frequency – The number of times that a particular event has occurred
Relative Frequency – The frequency divided by the total number of observations
% Frequency – The relative frequency × 100

18 Frequency, Relative Frequency and Percentage Frequency eg1. The frequency table pictured shows the size of 30 pizzas ordered from Pizza Hut on Monday night.
a) Find the Frequency of a medium pizza being ordered. Frequency = 12
b) Find the Relative Frequency of a medium pizza being ordered.
c) Find the % Frequency of a medium pizza being ordered.

19 Frequency, Relative Frequency and Percentage Frequency eg2. A group of 20 people were asked how many times they attended the cinema this month. Results are shown on the histogram.
a) Find the Frequency of attending the cinema twice a month. Frequency = 4
b) Find the Relative Frequency of attending the cinema twice a month.
c) Find the % Frequency of attending the cinema twice a month.
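Parts b) and c) of the two examples can be checked with a short Python sketch. This is an illustration rather than part of the original slides: the helper name `frequency_measures` is made up here, and it assumes the usual definition of relative frequency as the frequency divided by the total number of observations.

```python
def frequency_measures(frequency, total):
    """Return (frequency, relative frequency, percentage frequency)."""
    relative = frequency / total            # a fraction between 0 and 1
    percentage = frequency * 100 / total    # the same fraction as a percentage
    return frequency, relative, percentage

# eg1: 12 medium pizzas out of 30 pizzas ordered
pizza = frequency_measures(12, 30)

# eg2: 4 people out of 20 attended the cinema twice
cinema = frequency_measures(4, 20)

print("Medium pizza:", pizza)
print("Cinema twice:", cinema)
```

Multiplying by 100 before dividing keeps the percentage free of small floating-point rounding errors.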

20 Now Do Introduction to Statistics Worksheet Question 3

21 Data Distribution We can name data according to how it’s distributed. Is it all crammed together, or is there more data in certain areas? We associate certain names with different shapes of distribution:
Normal – Most common score in the centre of the data
Skewed – Most common score is toward one end of the data
Bimodal – More than one score that is most frequent
Spread – Data is spread over a wide range
Clustered – Most of the data is confined to a small range

22 Data Distribution Normally Distributed Data The most common score in the centre of the data. The graph is symmetrical.

23 Data Distribution Skewed Data The most common score is toward one end of the data.
Most data toward the left – Positively Skewed
Most data toward the right – Negatively Skewed

24 Data Distribution Bimodal Data More than one score that is most frequent This looks like two peaks on the graph

25 Data Distribution Spread Data Data is rather evenly spread over a wide range

26 Data Distribution Clustered Data Most of the data is confined to a small range

27 Grouping Data For some sets of data it is appropriate to group it before plotting it. When grouping data, we usually use a group size or ‘class size or interval’ of 5 or 10.
eg1. Group the following 20 test scores using a class size of 10.
90, 77, 68, 72, 88, 83, 45, 51, 54, 41, 97, 78, 81, 61, 55, 93, 74, 71, 78, 64
Test Score    Tally    Frequency
40 -                   2
50 -                   3
60 -                   3
70 -                   6
80 -                   3
90 -                   3
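The grouping step can also be sketched in Python as a way of checking the tally. The function name `group_data` is made up for this illustration, and the sketch assumes each class includes its lower bound and stops just short of the next one (so a score of 90 falls in the "90 -" class).

```python
from collections import Counter

def group_data(scores, class_size):
    """Count the scores in each class; each class is labelled by its lower bound."""
    counts = Counter((score // class_size) * class_size for score in scores)
    lowest, highest = min(counts), max(counts)
    # Include empty classes between the lowest and highest so the table has no gaps.
    return {lower: counts.get(lower, 0)
            for lower in range(lowest, highest + class_size, class_size)}

test_scores = [90, 77, 68, 72, 88, 83, 45, 51, 54, 41,
               97, 78, 81, 61, 55, 93, 74, 71, 78, 64]

table = group_data(test_scores, 10)
for lower, frequency in table.items():
    print(f"{lower} -   {frequency}")
```

Calling the same function with a class size of 5 gives classes of width 5.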

28 Grouping Data eg1. Now represent the grouped data using a histogram. [Histogram: Test score on the horizontal axis (40 to 100), Frequency on the vertical axis (0 to 6)]

29 Grouping Data eg1. Now represent the grouped data using a stem and leaf plot, with a class size of 10
90, 77, 68, 72, 88, 83, 45, 51, 54, 41, 97, 78, 81, 61, 55, 93, 74, 71, 78, 64
Stem    Leaf
4       1, 5
5       1, 4, 5
6       1, 4, 8
7       1, 2, 4, 7, 8, 8
8       1, 3, 8
9       0, 3, 7
Key: 4 | 1 = 41
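A stem and leaf plot with a class size of 10 splits each score into its tens digit (the stem) and its units digit (the leaf). A minimal Python sketch of that idea follows; the function name `stem_and_leaf` is made up for this illustration.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Map each stem (tens) to its leaves (units), in ascending order."""
    plot = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)   # e.g. 41 -> stem 4, leaf 1
        plot[stem].append(leaf)
    return dict(plot)

test_scores = [90, 77, 68, 72, 88, 83, 45, 51, 54, 41,
               97, 78, 81, 61, 55, 93, 74, 71, 78, 64]

plot = stem_and_leaf(test_scores)
for stem, leaves in plot.items():
    print(f"{stem} | {', '.join(str(leaf) for leaf in leaves)}")
```

Sorting the data first means the stems and the leaves within each stem both come out in order.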

30 Grouping Data For some sets of data it is appropriate to group it before plotting it. When grouping data, we usually use a group size or ‘class size or interval’ of 5 or 10.
eg2. Group the following 20 scores using a class size of 5.
10, 4, 6, 13, 18, 9, 7, 14, 21, 23, 8, 15, 19, 22, 14, 15, 17, 3, 9, 11
Score    Tally    Frequency
0 -               2
5 -               5
10 -              5
15 -              5
20 -              3

31 Grouping Data eg2. Now represent the grouped data using a histogram. [Histogram: Score on the horizontal axis (0 to 25), Frequency on the vertical axis (0 to 5)]

32 Grouping Data eg2. Now represent the grouped data using a stem and leaf plot, with a class size of 5
10, 4, 6, 13, 18, 9, 7, 14, 21, 23, 8, 15, 19, 22, 14, 15, 17, 3, 9, 11
Stem    Leaf
0       3, 4
0*      6, 7, 8, 9, 9
1       0, 1, 3, 4, 4
1*      5, 5, 7, 8, 9
2       1, 2, 3
Key: 0 | 3 = 3, 0* | 6 = 6

33 Now Do Statistics Worksheet 2 Question 1 and 2

34 Measures of Centre

35 Mean

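36 Median Median – The middle score of an ordered set of data with ‘n’ pieces of data. eg. Find the median of the data set: 4, 2, 6, 7, 10, 3, 7, 3, 6, 7 Solution: Write the scores in order, smallest to largest: 2, 3, 3, 4, 6, 6, 7, 7, 7, 10. There are 10 scores, so the median lies halfway between the 5th and 6th scores (6 and 6). Median = 6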

37 Mode The Mode – The most commonly occurring number in the set of data. eg. Find the mode of the data set: 4, 2, 6, 7, 10, 3, 7, 3, 6, 7 Solution: You may wish to write the scores in order to ensure all data is accounted for, but this is not necessary: 2, 3, 3, 4, 6, 6, 7, 7, 7, 10. If more than one score occurs most frequently, each of them is a mode – list them all. Mode = 7
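The three measures of centre for this data set can be checked with Python's built-in `statistics` module. This is a sketch alongside the calculator method rather than part of the original slides; it takes the mean to be the sum of the scores divided by the number of scores, and the variable names are chosen for illustration.

```python
from statistics import mean, median, multimode

scores = [4, 2, 6, 7, 10, 3, 7, 3, 6, 7]

centre = {
    "mean": mean(scores),        # sum of the scores divided by how many there are
    "median": median(scores),    # middle of the ordered scores
    "modes": multimode(scores),  # every score that occurs most often
}

for name, value in centre.items():
    print(name, "=", value)
```

`multimode` returns a list, which suits the rule that every most-frequent score is listed as a mode.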

38 Median = 6

39 Let’s now try to solve the same problem using the ‘STATISTICS’ function on our calculators. 2, 3, 4, 4, 6, 7, 8, 9, 10 Using the ClassPad to solve

40 Let’s now try to solve the same problem using the ‘STATISTICS’ function on our calculators. 2, 3, 4, 4, 6, 7, 8, 9, 10 Using the ClassPad to solve: Mean, Mode, Median

41 Now Do Statistics Worksheet 2 Question 3

42 Then Do Work Record Exercise 9A pg564 Questions 1, 2, 3, 4, 5, 6, 7, 8, 11

43 Quartiles Another way to analyse a set of data is to create a 5-figure summary. This summarises the data in terms of quartiles, ie. it divides the data set into quarters. To create a 5-figure summary we find the following:
Minimum Value (Min)
Lower Quartile (Q1) – The number 25% (a quarter) through the data
Median (Q2) – The number 50% (halfway/the centre) through the data
Upper Quartile (Q3) – The number 75% (three quarters) through the data
Maximum Value (Max)

44 What does a 5-figure summary look like?
eg. Find the 5-figure summary for the data set: 1, 1, 3, 4, 5, 6, 7, 7, 8
Min = 1, Q1 = 2, Med (Q2) = 5, Q3 = 7, Max = 8
Min, Q1, Med, Q3, Max: 1, 2, 5, 7, 8

45 What does a 5-figure summary look like?
eg. Find the 5-figure summary for the data set: 10, 11, 13, 13, 15, 16, 17, 19
Min = 10, Q1 = 12, Med (Q2) = 14, Q3 = 16.5, Max = 19
Min, Q1, Med, Q3, Max: 10, 12, 14, 16.5, 19
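Both worked examples are consistent with finding Q1 and Q3 as the medians of the lower and upper halves of the ordered data, leaving out the middle value when there is an odd number of scores. The Python sketch below assumes that method. The function name `five_figure_summary` is made up for this illustration, and other software may use a different quartile rule and give slightly different values.

```python
from statistics import median

def five_figure_summary(data):
    """Return (Min, Q1, Median, Q3, Max) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    lower_half = ordered[:n // 2]            # scores before the middle
    upper_half = ordered[n // 2 + n % 2:]    # scores after the middle (skips it when n is odd)
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])

summary_odd = five_figure_summary([1, 1, 3, 4, 5, 6, 7, 7, 8])
summary_even = five_figure_summary([10, 11, 13, 13, 15, 16, 17, 19])

print(summary_odd)
print(summary_even)
```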

46 What does a 5-figure summary look like? Let’s confirm our result using the calculator: 10, 11, 13, 13, 15, 16, 17, 19 Min, Q1, Med, Q3, Max: 10, 12, 14, 16.5, 19

47 What does a 5-figure summary look like? Let’s confirm our result using the calculator: 10, 11, 13, 13, 15, 16, 17, 19 Min, Q1, Med, Q3, Max: 10, 12, 14, 16.5, 19

48 Now Do Worksheet 3 Question 1

49 Measures of Spread We can use the following to determine how spread out our data set is.
Range = Maximum Value – Minimum Value
Interquartile Range (IQR) = Q3 – Q1
eg. Find the Range and the Interquartile Range for the data set: 3, 3, 4, 6, 7, 8, 10
Range = 10 – 3 = 7
IQR = 8 – 3 = 5

50 Measures of Spread We can use the following to determine how spread out our data set is.
Range = Maximum Value – Minimum Value
Interquartile Range (IQR) = Q3 – Q1
eg. Find the Range and the Interquartile Range for the data set: 13, 14, 18, 20, 23, 28, 30
Range = 30 – 13 = 17
IQR = 28 – 14 = 14
IQR gives a good indication of spread when we have small or large values that may not best reflect our data set.
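The two measures of spread can be sketched in Python as a check on the worked examples. The helper names `quartiles`, `data_range` and `iqr` are made up for this illustration, and Q1 and Q3 are assumed to be the medians of the lower and upper halves of the ordered data.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(data)
    n = len(ordered)
    return median(ordered[:n // 2]), median(ordered[n // 2 + n % 2:])

def data_range(data):
    """Range = Maximum Value - Minimum Value."""
    return max(data) - min(data)

def iqr(data):
    """Interquartile Range = Q3 - Q1."""
    q1, q3 = quartiles(data)
    return q3 - q1

first = [3, 3, 4, 6, 7, 8, 10]
second = [13, 14, 18, 20, 23, 28, 30]

print(data_range(first), iqr(first))
print(data_range(second), iqr(second))
```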

51 Now Do Worksheet 3 Question 2

52 Outliers Some data sets include large or small values that don’t match the rest of the data. These can give us values for the measures of centre and measures of spread that aren’t the best representation of the data. A value is considered an ‘outlier’ if it is:
Less than Q1 – (1.5 × IQR), or
Greater than Q3 + (1.5 × IQR)

53 Outliers eg. Decide if the following data set includes an outlier. 1, 5, 6, 7, 7, 9, 16
Step 1: Find Q1 and Q3. Q1 = 5, Q3 = 9
Step 2: Find the IQR. IQR = 9 – 5 = 4
A value is considered an ‘outlier’ if it is less than Q1 – 1.5 × IQR or greater than Q3 + 1.5 × IQR.
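The final step of this example, comparing each value against the two limits, can be sketched in Python. The function name `find_outliers` and the labels "lower limit" and "upper limit" are chosen for this illustration, and Q1 and Q3 are again assumed to be the medians of the lower and upper halves of the ordered data.

```python
from statistics import median

def find_outliers(data):
    """Return (lower limit, upper limit, list of outliers) using the 1.5 x IQR rule."""
    ordered = sorted(data)
    n = len(ordered)
    q1 = median(ordered[:n // 2])
    q3 = median(ordered[n // 2 + n % 2:])
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr
    outliers = [x for x in ordered if x < lower_limit or x > upper_limit]
    return lower_limit, upper_limit, outliers

lower_limit, upper_limit, outliers = find_outliers([1, 5, 6, 7, 7, 9, 16])
print("Lower limit:", lower_limit)
print("Upper limit:", upper_limit)
print("Outliers:", outliers)
```

For this data set the limits are 5 – 1.5 × 4 = –1 and 9 + 1.5 × 4 = 15, so 16 is an outlier because it is greater than 15.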

54 Now Do Worksheet 3 Question 3 Then Exercise 9B – Q1, 2, 3, 4b, 4d, 5, 6, 7, 8, 9a, 9c, 10

55 Boxplots

56 Boxplots Boxplots are always drawn to scale with a ruled, labelled axis at the base of the plot. [Diagram labels: Xmin, Q1, Median, Q3, Xmax, Scale]

57 Boxplots eg. A set of data gives the 5-figure summary 2, 5, 9, 13, 18. Represent this using a boxplot. [Boxplot: Min = 2, Q1 = 5, Median = 9, Q3 = 13, Max = 18]

58 Boxplots eg. Draw the boxplot for the data set: 3, 4, 4, 5, 6, 6, 7, 9, 11, 12, 15 Are there any outliers? [Boxplot: Min = 3, Q1 = 4, Median = 6, Q3 = 11, Max = 15] No outliers

59 Boxplots eg. Draw the boxplot for the data set: 2, 3, 5, 8, 9, 9, 10, 10, 13, 20 Are there any outliers? Outlier = 20 (shown as an x on the plot). Let’s see how we can use a calculator to plot these.

60 Boxplots eg. Draw the boxplot for the data set: 2, 3, 5, 8, 9, 9, 10, 10, 13, 20

61 Boxplots Use zoom ‘Box’ to get a better view of the plot. To see the points on the plot, use Analysis ‘Trace’. Use your arrow keys to move from point to point.

62 Parallel Boxplots We can easily compare sets of data using parallel boxplots. These consist of two or more boxplots drawn together using the same scale.
Given the parallel boxplots above:
What statistical measures do they have in common? Same values for med (14) and Q3 (17)
Which group of data, A or B, is most spread out? Group B – Largest Range and IQR
Which group has the largest Q1 value? What is it? Group A – 13

63 Now Do Exercise 9C Q1, 2, 3, 4a, 4c, 5a, 5c, 6, 7, 9, 11

64 Time Series Data A time series is a sequence of data values that are recorded at regular time intervals. The data is something meaningful that we monitor over a period of time, such as:
Temperature monitored every hour throughout the day
Monthly Average Temperature monitored throughout the year
Share price fluctuations monitored hourly/daily/monthly etc.
The time component is drawn on the x-axis. Data is plotted on the graph as dots, joined together with lines.

65 Describing Trends
Linear – Straight (or almost straight) line
Non-Linear (Curve) – Data forms a curve
No Trend – Data fluctuates.

66 Describing Trends eg. The plot below shows the change in population of a country town from 1990 to 2005.
a) What is the population in the year 2000? 800
b) What is the lowest population recorded? 700
c) State the trend of the data. The population declines steadily for the first 9 years, before rising and falling in the final 5 years, resulting in a slight upward trend.

67 eg. A company’s share price is recorded each month over 12 months, as given in the table below. a) Plot the time series graph of the data (start your y-axis data at $1.20).

68 b) Describe the way the share price has changed over the year. The share price generally increased from January to June (from $1.30 to a peak of $1.43), with a small drop of $0.01 in April. After June, the price declined steadily to a low of $1.22 before trending upward to $1.23 in December.

69 Now Do Exercise 9D Q1, 2, 3, 5, 6, 7

70 Bivariate Data Bivariate Data involves comparing data that includes two variables. We analyse the data by plotting it on a scatterplot. We look at the direction and shape of the data on the plot, and from this we can state the strength of the relationship between the two variables – we call this the ‘correlation’.

71 Positive Correlation What does this look like? Strong Positive Weak Positive

72 Negative Correlation What does this look like? Strong Negative Weak Negative

73 No Correlation What does this look like?

74 Bivariate Data eg. Draw the scatterplot for the data and comment on the correlation of the data
x: 1, 1, 2, 3, 4, 5, 6, 7, 8, 8, 9, 10, 11
y: 10, 9, 11, 13, 15, 17, 18, [one value missing], 20, 19, 22, 24, 25

75 We look at the pattern that the points have made – the dots could form a straight line in a positive direction, so we can say the data has a strong positive correlation

76 Now Do Exercise 9E Q1, 2a, 3, 4, 5a, 6, 7, 9

77 Line of best fit When bivariate data has a strong linear correlation, we can model the data with a line of best fit. We fit the line ‘by eye’ to try to balance the data points above the line with the points below the line. What does it look like?

78 Drawing the line of best fit We fit the line ‘by eye’ to try and balance the data points above the line with points below the line.

79 Now Do Worksheet – Part 1 Drawing a line of best fit

80 Writing the equation for line of best fit

81

82 Now Do Worksheet – Part 2 Forming the equation for the line of best fit

83 Using the line of best fit to make predictions
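The working on the slides for the equation and the predictions did not survive in this transcript. The Python sketch below assumes the common approach of reading two points off the line drawn by eye, forming the equation y = mx + c, and substituting an x-value to predict y. The two points and the function names `line_through` and `predict` are hypothetical and chosen only for illustration.

```python
def line_through(point1, point2):
    """Return (gradient m, y-intercept c) of the straight line through two points."""
    (x1, y1), (x2, y2) = point1, point2
    gradient = (y2 - y1) / (x2 - x1)    # rise over run
    intercept = y1 - gradient * x1      # rearranged from y1 = m * x1 + c
    return gradient, intercept

def predict(gradient, intercept, x):
    """Use y = mx + c to estimate y for a given x."""
    return gradient * x + intercept

# Hypothetical points read off a line of best fit drawn by eye
m, c = line_through((2, 11), (10, 23))
estimate = predict(m, c, 6)

print(f"y = {m}x + {c}")
print("Predicted y when x = 6:", estimate)
```

Predictions made inside the range of the plotted data are generally more trustworthy than those made well outside it.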

84

85 Now Do Worksheet – Part 3 Making predictions using the line of best fit Then Do Exercise 9F Q1, 2, 4abc, 5, 6

