Statistics & Probabilities Grade 9 Pre IB
Data Types and Representation
Continuous numerical data: the data can take any value within a continuous range.
Discrete numerical data: the data can take only distinct, separate values.
Continuous Numerical Data
A continuous numerical variable can theoretically take any value on the number line.
Example: The weights, in kilograms, of pumpkins harvested by Salvi were recorded as: 2.1, 3.0, 0.6, 1.5, 1.9, 2.4, 3.2, 4.2, 2.6, 3.1, 1.8, 1.7, 3.9, 2.4, 0.3, 1.5, 1.2
Continuous Numerical Data
The data is continuous because the weight could be any value from 0.1 kg up to 10 kg. The recorded weights range from 0.3 kg to 4.2 kg. This data can be represented by the following graph.
Continuous Numerical Data
Weight (kg)    Frequency
0 ≤ w < 1      2
1 ≤ w < 2      6
2 ≤ w < 3      4
3 ≤ w < 4      4
4 ≤ w < 5      1
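The frequency table for the pumpkin weights can be checked by counting how many weights fall in each 1 kg class. The following is a minimal sketch, assuming classes of the form "lower ≤ w < lower + 1"; the variable names are illustrative and not part of the original slides.

```python
# Pumpkin weights in kilograms, as recorded in the example.
weights = [2.1, 3.0, 0.6, 1.5, 1.9, 2.4, 3.2, 4.2, 2.6,
           3.1, 1.8, 1.7, 3.9, 2.4, 0.3, 1.5, 1.2]

# Classes are 1 kg wide: 0 <= w < 1, 1 <= w < 2, ..., 4 <= w < 5.
counts = [0] * 5
for w in weights:
    # int() drops the decimal part, which gives the index of the class.
    counts[int(w)] += 1

for lower, count in enumerate(counts):
    print(f"{lower} <= w < {lower + 1}: {count}")
```

The heights of the bars in a histogram of this data would be the five counts printed here.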
Discrete Numerical Data
Data is made up of individual observations of a variable. A discrete numerical variable can take only distinct values, which we find by counting.
Discrete Numerical Data Example
1. Consider the set of Math, Science, History, and English courses and collect the following data.
2. How many of you like Math the best?
3. How many of you like Science the best?
4. How many of you like History the best?
5. How many of you like English the best?
6. Now we draw a graph of the collected data on a chart.
Discrete Numerical Data
Categorical Data
To find the differences between numerical and categorical data, answer the following questions:
How many pets do students in our class have?
How many hours a week do you spend watching TV?
What is your favourite sport?
What kind of music do you like best?
How many hours a week do you talk on the phone?
What kinds of snacks do you like?
How much do our backpacks weigh?
How much candy do we eat each week?
Statistics
In today's fast-moving computer age we collect vast quantities of data. Statistics is concerned with how data is collected, organized, presented, summarized, and then analyzed.
Statistics
Statistics is the branch of mathematics that deals with the collection, organization, and interpretation of data. These data are usually organized into tables and/or presented as graphs. Some common types of graphs are:
Statistics Showing by Graphs: Pictograph
A graph that uses a symbol or an image to represent a certain amount.
Statistics Showing by Graphs: Circle Graph
In a circle graph, the complete set of data is represented by the circle, and the various parts of the data are represented by the sectors of the circle. This method is useful when the data needs to be shown as a percentage or as a fraction of the whole.
Example of Circle Graph Math grade 9 Mark distribution
Statistics Showing by Graphs: Bar Graph
A bar graph uses vertical bars to represent different segments of the data, and it is used for discrete data. As an example, the students' favourite fruit juice colour can be represented by a bar graph.
Statistics Showing by Graphs: Histogram
A histogram uses bars to represent the frequency (or number) of data values within each range of values. It is used for continuous data. As an example, the distribution of the salaries of employees in a company can be represented by a histogram.
Statistics Showing by Graphs: Broken-Line Graph
A broken-line graph plots the data values on the y-axis, and adjacent points are joined by line segments. The only points on a broken-line graph that represent data are the endpoints of the segments; the exact value between the points is not known.
As an example, the mass of a rabbit in different months of the year is plotted.
Statistics Showing by Graphs: Continuous-Line Graph
This graph shows the value of one variable corresponding to the value of another variable for all values over a given interval. All the points on a continuous-line graph correspond to data. As an example, the following graph shows the distance required to bring a car to rest from the moment the brakes are applied, versus the car's speed, for speeds up to 100 km/h.
Using Scatter Plot
A scatter plot is a graphic tool used to display the relationship between two quantitative variables. It uses an x-axis, a y-axis, and a series of dots. Each dot represents one observation, and the position of the dot on the scatter plot represents its x and y values.
Scatter Plot Example
Weight versus Height of Basketball Players (table columns: Height (inches), Weight (pounds))
Interpreting the plot
Each player is represented by a dot on the scatter plot. The first (leftmost) dot represents the shortest player. This plot suggests that the relationship between height and weight can be approximately modelled by a straight line with a positive slope.
Activity: Scatter Plot
1. Measure your hand span.
2. Measure your height.
3. Gather the data from all members of your class and put it in a table.
4. Choose one variable as the independent variable and the other as the dependent variable, draw a scatter plot to represent the data, and then interpret your data in a few sentences.
Scatter Plot
1. Would you say the variables are continuous or discrete?
2. Are there any data points that don't fit the pattern? If so, explain.
3. What does the scatter plot suggest about how hand span and height are related?
Class Arm-Span vs Height
Stem-and-Leaf Plot
This kind of plot is used to organize discrete numerical data.
All of the actual data values are shown.
The minimum (smallest) data value is easy to find.
The maximum (largest) data value is easy to find.
The range of values that occurs most often is easy to see.
The shape of the distribution of the data is easy to see.
Exercise for the Stem-and-Leaf Plot
The scores on a test out of 50 were recorded for 36 students: 25, 36, 38, 49, 23, 46, 47, 15, 28, 38, 34, 9, 30, 24, 27, 27, 42, 16, 28, 31, 24, 46, 25, 31, 37, 35, 32, 39, 43, 40, 50, 47, 29, 36, 35, 33
Organize the data using a stem-and-leaf plot. What percentage of students scored 40 or more marks?
The Solutions and Plot
The stems will be 0, 1, 2, 3, 4, 5.
Unordered stem plot (table columns: Stem, Leaf)
Stem-and-Leaf Plot
Ordered stem plot (table columns: Stem, Leaf)
9 students scored 40 or more marks, which is 9/36 × 100% = 25%.
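The ordered stem plot and the 25% figure can be reproduced from the 36 scores in the exercise. This is a minimal sketch, assuming the stem is the tens digit and the leaf is the units digit; the variable names are illustrative.

```python
from collections import defaultdict

# Test scores out of 50 for the 36 students in the exercise.
scores = [25, 36, 38, 49, 23, 46, 47, 15, 28, 38, 34, 9,
          30, 24, 27, 27, 42, 16, 28, 31, 24, 46, 25, 31,
          37, 35, 32, 39, 43, 40, 50, 47, 29, 36, 35, 33]

# Stem = tens digit, leaf = units digit (so 9 has stem 0 and 50 has stem 5).
plot = defaultdict(list)
for s in sorted(scores):          # sorting first gives an ordered stem plot
    plot[s // 10].append(s % 10)

for stem in range(6):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")

# Percentage of students who scored 40 or more marks.
high = sum(1 for s in scores if s >= 40)
percent = 100 * high / len(scores)
print(f"{high} of {len(scores)} students = {percent}%")
```

Removing the call to `sorted` would give the unordered stem plot instead, with leaves in the order the scores were recorded.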
Central Tendency
Central tendency refers to the middle value of a data set; the mean, median, and mode are used to measure it. Which one best represents the central tendency depends on the situation.
The mean is influenced by extreme values in the data set (outliers).
The median is not influenced by extreme values in the data set (outliers).
The mode is the value or values that occur most often.
Investigate: when should you use the mean, the median, or the mode?
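The effect of an outlier on the mean and the median can be seen with a small experiment. The sketch below uses the peas-per-pod counts from the median exercise later in this deck and then appends one extreme value of 100, which is a made-up value added only for illustration. It relies on Python's standard `statistics` module.

```python
from statistics import mean, median

# Peas-per-pod counts from the median exercise in this deck.
pods = [3, 6, 5, 7, 7, 4, 6, 5, 6, 7, 6, 8, 10, 7, 8]

# Hypothetical outlier (NOT in the original data), added to show its effect.
pods_with_outlier = pods + [100]

print("without outlier:", round(mean(pods), 2), median(pods))
print("with outlier:   ", round(mean(pods_with_outlier), 2),
      median(pods_with_outlier))
```

In this sketch the mean roughly doubles when the outlier is added, while the median moves by only half a pea.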
The Mean
Looking at the middle or centre of a data set and measuring its spread gives a better understanding of the data set. The mean of a data set is the statistical name for its arithmetic average. It is found by dividing the sum of the data values by the number of data values.
Exercise, Finding Mean
The table below shows the numbers of aces served by tennis players in their set of the tournament. Determine the mean number of aces for these sets.
(table columns: Number of aces, Frequency)
Exercise, Finding Mean
(table columns: Number of aces, Frequency, Product; totals: Frequency 55, Product 179)
Mean = 179/55 ≈ 3.25 aces
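The solution uses the frequency-table form of the mean. Writing x for a number of aces and f for its frequency (symbols introduced here for illustration, not taken from the slide), the working can be sketched as:

```latex
\bar{x} = \frac{\sum f x}{\sum f} = \frac{179}{55} \approx 3.25 \text{ aces}
```

The Product column holds each value of f times x, so its total (179) is the numerator and the total of the Frequency column (55) is the denominator.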
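The Median
The median is the middle of an ordered data set. The data set is ordered by listing the data from smallest to largest. The median splits the data into two halves. If there are n data values, the median is the ((n+1)/2)th data value; when (n+1)/2 is not a whole number, the median is the average of the two data values on either side of that position.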
Exercise for Median
The following sets of data show the number of peas in randomly selected samples of pods. Find the median for each set.
3, 6, 5, 7, 7, 4, 6, 5, 6, 7, 6, 8, 10, 7, 8 (15 values)
The ordered set is: 3, 4, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 10
n = 15, (n+1)/2 = 8, so the median is the 8th data value.
The median = 6 peas
Exercise for Median
Find the median for the following data set: 3, 6, 5, 7, 7, 4, 6, 5, 6, 7, 6, 8, 10, 7, 8, 9 (16 values)
The ordered data set is: 3, 4, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 9, 10
n = 16, (n+1)/2 = 8.5, so the median is the average of the 8th and 9th data values.
The median is: (6 + 7)/2 = 6.5 peas
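The (n+1)/2 rule from these two exercises can be written as a short function. This is a minimal sketch, not a library routine; the function name `median` and the variable names are chosen here for illustration, and the function assumes the list is not empty.

```python
def median(values):
    """Median via the (n + 1) / 2 position rule used in the exercises."""
    ordered = sorted(values)          # order from smallest to largest
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: (n + 1) / 2 is a whole number; subtract 1 because
        # Python lists count positions from 0.
        return ordered[(n + 1) // 2 - 1]
    # Even n: average the two values on either side of position (n + 1) / 2.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

pods_15 = [3, 6, 5, 7, 7, 4, 6, 5, 6, 7, 6, 8, 10, 7, 8]
pods_16 = pods_15 + [9]

median_15 = median(pods_15)
median_16 = median(pods_16)
print(median_15, median_16)
```

The first call picks the 8th ordered value, and the second averages the 8th and 9th ordered values, matching the two cases in the exercises.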
More exercise for Median
The data in the table below shows the number of people at each table in a restaurant. Find the median of this data. (table columns: Number of people, Frequency)
The total number of data values is the number of tables in the restaurant. It is the sum of the frequencies, which is n = 38. The median is the average of the 19th and 20th data values.
More exercise for Median
(table columns: Number of people, Frequency)
In the ordered data, the values of 8 or less come first, and the 14th to the 25th data values are all 9s. The 19th and 20th data values are therefore both 9, so the median is 9 people.
Mode
The mode of a list of numbers is the number (or numbers) that occurs most frequently. A trick to remember this is that "mode" starts with the same two letters as "most".
Example: find the mode of 9, 3, 3, 44, 15, 17, 17, 44, 15, 15, 27, 40, 8.
Put the list in order: 3, 3, 8, 9, 15, 15, 15, 17, 17, 27, 40, 44, 44.
The mode is 15. A data set might have more than one mode, or none.
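Counting how often each value occurs makes the mode easy to find, including the cases of several modes or no mode. This is a minimal sketch using `collections.Counter` from Python's standard library. The function name `modes` is illustrative, and treating a list with no repeated values as having no mode is a convention assumed here.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs most often, smallest first."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # Assumed convention: if nothing repeats, there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == highest)

data = [9, 3, 3, 44, 15, 17, 17, 44, 15, 15, 27, 40, 8]
result = modes(data)
print(result)
```

For the example list, 15 occurs three times and every other value occurs at most twice, so 15 is the only mode.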
The Spread of Data
To accurately describe a data set, we need not only a measure of its centre but also a measure of its spread. A commonly used statistic that indicates the spread of a set of data is the range.
The Range
The range is the difference between the maximum (largest) data value and the minimum (smallest) data value.
Range = maximum data value - minimum data value
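As a quick illustration of the range formula, the sketch below applies it to the 36 test scores from the stem-and-leaf exercise in this deck. The variable names are illustrative.

```python
# Test scores out of 50 from the stem-and-leaf exercise.
scores = [25, 36, 38, 49, 23, 46, 47, 15, 28, 38, 34, 9,
          30, 24, 27, 27, 42, 16, 28, 31, 24, 46, 25, 31,
          37, 35, 32, 39, 43, 40, 50, 47, 29, 36, 35, 33]

# Range = maximum data value - minimum data value.
data_range = max(scores) - min(scores)
print(max(scores), min(scores), data_range)
```

The highest score is 50 and the lowest is 9, so the range is 41 marks.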
Primary and Secondary Source
A primary source is information that is created at the first stage; it is the original set of information. A secondary source is information that can be found in different news sources and is not original.
Example of Scatter Plot
Women's Olympic Discus Record: the table shows the results from 1948 to 1996 (table columns: Year, Women (m)).
Scatter Plot for the Olympic Discus Results
Find a line that best fits the data points.
Line of Best Fit (Linear)
The line of best fit is a straight line that best represents the data on a scatter plot. In this example the relation between x and y is positive; that is, the slope of the line of best fit is positive.
Extrapolation of data
One application of the line of best fit is to predict data beyond the available range. As an example, the women's Olympic discus record for the next Olympics (2016) can be predicted by extending the line of best fit to the year 2016. This prediction is called extrapolation. From the line, the 2016 result can be predicted to be 85 m.
Use the line to predict the future (2016)
Interpolation
Let's assume that the result of the women's Olympic discus for 1976 was lost and we have the rest. Interpolation can help to estimate the missing value. Let's look at the scatter plot for the women's Olympic discus without the information for 1976.
Interpolation
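On the slides the line of best fit is drawn by eye. One common way to compute such a line is the least-squares method, which is a technique swapped in here and not something the slides describe. The sketch below uses a small set of made-up (x, y) pairs, not the Olympic results, to show how the same fitted line gives an interpolated value inside the data range and an extrapolated value beyond it. All names are illustrative.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data for illustration only (NOT the Olympic discus results).
# The value at x = 3 is deliberately missing.
xs = [1, 2, 4, 5]
ys = [2, 4, 7, 9]

slope, intercept = fit_line(xs, ys)

interpolated = slope * 3 + intercept   # x = 3 lies inside the data range
extrapolated = slope * 7 + intercept   # x = 7 lies beyond the data range
print(slope, intercept, interpolated, extrapolated)
```

An extrapolated value is usually less trustworthy than an interpolated one, because nothing in the data confirms that the trend continues outside the observed range.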
Line of best fit (Non-Linear)
Survey
What does a survey mean? When is it conducted? What does sampling mean?
When we have a large amount of data, it is often useful to study only a portion of it to gain insight into the complete set of information.
Survey
When you sip a spoonful of soup to test how hot a bowl of soup is, you are sampling. Based on the temperature of the soup in your spoon, you decide whether it is too hot to eat. In this case the spoonful of soup is the sample and the bowl of soup is the population.
Survey
In a random sample, all members of the population have an equal chance of being selected. Suppose a survey is conducted to determine the favourite TV program of students in your school, but only students in your class are surveyed. Since students in other classes have not been asked, the sample is not a random sample: not all students in your school are represented.
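One way to give every student the same chance of being selected is to draw names at random from the whole school list. The sketch below uses `random.Random.sample` from Python's standard library. The roster of 300 numbered students and the sample size of 30 are made-up values for illustration.

```python
import random

# Hypothetical school roster: 300 student ID numbers (made up for illustration).
population = list(range(1, 301))

# A fixed seed makes this sketch repeatable; a real survey would not fix it.
rng = random.Random(42)

# Draw 30 different students; every student is equally likely to be chosen.
sample = rng.sample(population, 30)
print(sorted(sample))
```

Surveying only one class corresponds to taking the first 30 entries of the list, which leaves every other student with no chance of being selected.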
Survey Examples
Explain why each sample may not provide accurate information about its population.
1. A survey of your classmates is used to estimate the average age of students in your school.
2. A survey of senior citizens is used to determine the music that is best liked by Canadians.
3. To determine the ratio of domestic cars to foreign cars purchased by Canadians, a person records the numbers of domestic cars and foreign cars in the parking lot of the General Assembly Plant in Oshawa, Ontario.
Probability Activity
If you are given 3 coins and toss them, the following table shows the possible outcomes you might get. As shown, there are 8 possible equally likely outcomes, so the probability of each outcome is 1/8. Now disregard which coin shows a tail or a head, and group together the outcomes that have 2 tails or 2 heads. Try to calculate the probability of each of these grouped outcomes. Follow the instructions in the activity and do the activity.
0  TTT
1  TTH
2  THT
3  THH
4  HTT
5  HTH
6  HHT
7  HHH
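The table of outcomes and the grouped probabilities can be generated by listing every combination of T and H for three coins. This is a minimal sketch using `itertools.product`, `collections.Counter`, and `fractions.Fraction` from Python's standard library. Grouping by the number of heads is one way to "disregard which coin" shows which face, and the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes for 3 coins, in the order TTT, TTH, ..., HHH.
outcomes = ["".join(p) for p in product("TH", repeat=3)]
for number, outcome in enumerate(outcomes):
    print(number, outcome)

# Group the outcomes by how many heads they contain, ignoring which coin is which.
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability of each group = outcomes in the group / total number of outcomes.
probabilities = {heads: Fraction(count, len(outcomes))
                 for heads, count in sorted(heads_counts.items())}
for heads, p in probabilities.items():
    print(f"{heads} heads: {p}")
```

Exactly two heads (which is the same as exactly one tail) covers three of the eight outcomes, so its probability is 3/8, and the four grouped probabilities add up to 1.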