• NHANES 1000, MARIJUANA USE • COMPARE WHAT HAPPENS WHEN WE CHANGE SAMPLE SIZE CENSUS AT SCHOOL, ARM SPAN LOOKING AT WHAT HAPPENS TO THE SAMPLING.


• NHANES 1000, MARIJUANA USE
• COMPARE WHAT HAPPENS WHEN WE CHANGE SAMPLE SIZE
CENSUS AT SCHOOL, ARM SPAN
LOOKING AT WHAT HAPPENS TO THE SAMPLING ERROR FOR A BOX PLOT

"Whenever I see I remember "

HOW CONFIDENT ARE YOU NOW?

• OR: HOW WRONG OUR ESTIMATES COULD BE
• LAST YEAR (SOME OF) YOU MET THE IDEA OF INFORMAL CONFIDENCE INTERVALS

• Our sample gives us this:
• But we need to see this:
• So we need an idea of how “wrong” we could be:
  - We could take another sample, or an even larger sample
  - Even better, we could take another 1000 samples and find the average of our results
  - In 202 we used a formula to find an interval that was likely to contain the population median
• In 302 we use something called BOOTSTRAPPING to do something similar

Ponyland is a mystical land, home to all kinds of magical creatures. The Little Ponies make their home in Paradise Estate, living a peaceful life filled with song and games. However, not all of the creatures of Ponyland are so peaceful, and the Ponies often find themselves having to fight for survival against witches, trolls, goblins and all the other beasts that would love to see the Little Ponies destroyed, enslaved or otherwise harmed.

How tall is your PONY?
• Casio graphics calculator: RandNorm#(5, 150), RanInt#(5,150)
• Excel: =NORMSINV(RAND())*5+150
• TI 84+ calculator: randNorm(150,5)
Draw your PONY
Tell your neighbor about your PONY
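The calculator and Excel commands above all draw a height from a normal distribution with mean 150 and standard deviation 5. A minimal Python sketch of the same idea follows; the function name `pony_height` and the seed are illustrative assumptions, not part of the original activity.

```python
import random


def pony_height(mean_cm=150.0, sd_cm=5.0, rng=random):
    """Draw one pony height (cm) from a normal distribution."""
    return rng.gauss(mean_cm, sd_cm)


# A seeded generator makes the draws repeatable; the seed value is arbitrary.
rng = random.Random(2024)
heights = [round(pony_height(rng=rng), 1) for _ in range(5)]
print(heights)
```

Each run with a different seed gives a different set of ponies, which is the point of the activity: every student's pony is a separate draw from the same population.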

I wonder what the mean height is for the Little Ponies at Paradise Estate?
• At Level 6 (Year 11) we used sample means/medians to estimate the population mean; however, we know a single estimate carries too much uncertainty
• At Level 7 (Year 12) we used an informal confidence interval to estimate the range where the population mean is likely to lie
• At Level 8 (Year 13) you will learn a new analysis tool called bootstrapping

1. Find the mean and median of your sample first and write them down
2. Shuffle your Ponies
3. Select 1 and record its height in Excel
4. Put that Pony back and re-shuffle
5. Select another Pony
6. Repeat the process until you have recorded 11 Pony heights
THIS IS YOUR SAMPLE OF 11
Find the mean and median of your sample
Plot your mean and median on the board using the appropriate colour
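The shuffle, select, record and put-back routine is sampling with replacement. It can be sketched in Python as below; the pony heights are made-up values for illustration, and `draw_sample` is a hypothetical helper name.

```python
import random
import statistics


def draw_sample(ponies, size=11, rng=random):
    """Select `size` heights one at a time, putting each pony back before the next draw."""
    return [rng.choice(ponies) for _ in range(size)]


rng = random.Random(1)  # arbitrary seed so the draw is repeatable
ponies = [143.2, 147.5, 149.1, 150.0, 151.8, 152.4, 155.9, 158.3]  # made-up heights (cm)

sample = draw_sample(ponies, rng=rng)
sample_mean = statistics.mean(sample)
sample_median = statistics.median(sample)
print(sample, round(sample_mean, 1), sample_median)
```

Because each pony goes back before the next draw, the same pony can appear more than once in the sample of 11.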

• Good News: Bootstrapping will be the easiest part of your Inference assignment

Using iNZight to re-sample
• Start iNZight and select the Bootstrap Confidence Interval Construction VIT module.
• Import the Pony sample session 1 file.
• Drag Height down to the variable 1 box, and then click the Analyse tab.
• The default quantity is “mean”. Do NOT change this, just click on “Record my choices”.
• Play, and replicate what you have just done by hand. Check you know what each selection does.
• To finish, copy and paste the Bootstrap distribution of re-sample means into a Word document.
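For readers without iNZight to hand, the bootstrap distribution of re-sample means can be sketched in plain Python. This is an illustration under stated assumptions, not iNZight's own implementation: the sample heights are made up, and the interval uses a simple percentile rule (the middle 95% of the sorted re-sample means).

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=1000, rng=random):
    """Re-sample with replacement (same size as the sample) and record each mean."""
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=len(sample))
        means.append(statistics.mean(resample))
    return means


def percentile_interval(values, level=0.95):
    """Return the ends of the central `level` share of the sorted values."""
    ordered = sorted(values)
    tail = (1 - level) / 2
    lower = ordered[round(tail * len(ordered))]
    upper = ordered[round((1 - tail) * len(ordered)) - 1]
    return lower, upper


rng = random.Random(11)  # arbitrary seed
sample = [144.0, 146.5, 148.2, 149.0, 150.1, 150.7, 151.3, 152.8, 154.4, 156.0, 157.9]  # made-up
means = bootstrap_means(sample, rng=rng)
lower, upper = percentile_interval(means)
print(round(lower, 1), round(upper, 1))
```

The list `means` plays the role of the bootstrap distribution shown in the module, and `lower` and `upper` mark the ends of the bootstrap confidence interval.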

Using iNZight to check how well this method works
• Start iNZight and select the Confidence interval coverage VIT module (or select FILE and VIT modules).
• Import the Pony height population file.
• Drag “Height” down to the variable 1 box, and then click the Analyse tab.
• The default quantity is mean. Do NOT change this. Change the CI Method to bootstrap: percentile and the Sample Size to 10, then click on Record my choices.
• Play. Check you know what each selection does, and how it relates to the bootstrap confidence intervals.
Just remember: You will rarely have data on the whole population! This is just a teaching tool to show you how it works!
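The coverage idea can also be sketched in Python: take many samples of size 10 from a known population, build a percentile bootstrap interval from each, and count how often the interval contains the true population mean. The population here is simulated (normal, mean 150, standard deviation 5) and all names are illustrative assumptions; it is not the Pony height population file.

```python
import random
import statistics


def percentile_ci(sample, rng, n_resamples=200, level=0.95):
    """Percentile bootstrap interval for the mean of one sample."""
    means = sorted(
        statistics.mean(rng.choices(sample, k=len(sample)))
        for _ in range(n_resamples)
    )
    tail = (1 - level) / 2
    return means[round(tail * n_resamples)], means[round((1 - tail) * n_resamples) - 1]


def coverage(population, rng, sample_size=10, n_samples=200):
    """Share of samples whose bootstrap interval contains the population mean."""
    true_mean = statistics.mean(population)
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(population, sample_size)  # one fresh sample from the population
        low, high = percentile_ci(sample, rng)
        if low <= true_mean <= high:
            hits += 1
    return hits / n_samples


rng = random.Random(7)  # arbitrary seed
population = [rng.gauss(150, 5) for _ in range(2000)]  # simulated population
rate = coverage(population, rng)
print(rate)
```

With samples as small as 10, the observed rate typically falls somewhat short of the nominal 95%, which is one of the things the coverage module lets students see for themselves.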

• posing a comparison investigative question using a given multivariate data set
• selecting and using appropriate displays and summary statistics
• discussing sample distributions
• discussing sampling variability, including the variability of estimates
• making an appropriate formal statistical inference
• communicating findings in a conclusion.

• Achieved: “Use statistical methods to make a formal inference” involves showing evidence of using each component of the statistical enquiry cycle.
• Merit: “Use statistical methods to make a formal inference, with justification” involves linking components of the statistical enquiry cycle to the context, and referring to evidence such as sample statistics, data values, or features of visual displays in support of statements made.
• Excellence: “Use statistical methods to make a formal inference, with statistical insight” involves integrating statistical and contextual knowledge throughout the statistical enquiry cycle, and may include reflecting about the process; considering other relevant explanations.

PROBLEM AND PLAN STAGE

I wonder what the difference is between the median weight of forward and back rugby players in New Zealand, according to a sample from sidestep-central.com/

The question includes:
• What you are comparing (you must include the mean or median)
• The characteristic you are grouping by
• What the population is
• Where your sample data is sourced from

The weight is the weight of the rugby players in kilograms, and the position is the player’s normal position on the rugby field, either forward or back.

What next? What did we do for the BIVARIATE standard?

• I am doing this investigation as I play rugby and it has often been commented that I would be better suited to playing as a back due to my size and weight. I wonder how my weight compares to the median ….. I would expect the median weight for backs to be less than for forwards although ……

Autism and Vaccines
As you may know, the link between autism and vaccines has a long and contentious history. Use this topic to do some research into this area. The table below may help you summarise your findings.
Table headings: Basic facts | One thing I didn’t know | One thing I found interesting
Come up with AT LEAST two different questions.
I DO NOT want you to spend much time on this.

SUCCOS
S - SPREAD: Discuss the interquartile range (IQR), which is UQ – LQ. This is the spread of the middle 50%.
U - UNUSUAL FEATURES: This is usually seen by looking at the raw data (dot plot) OR a long whisker.
C - CLUSTERS: Where does most of the data lie, OR are there any groupings?
C - CENTRE: Compare the middle 50% of the data and which is higher up the scale.
O - OVERLAP: Is there a visible overlap of the boxes?
S - SHAPE: Is there an even distribution? (Median in the middle of the box and whiskers even in length.)

• SPREAD
• Comparing the sizes of the spreads
What do you see? What does this mean for the sample? What does this mean for the population?
The interquartile range for the forwards is 12.2 kg whereas the interquartile range for the backs is 7.5 kg. The range is also greater for the forwards than the backs. The standard deviation is also higher for the forwards. This indicates that the forwards have more variation in their weights than the backs. Overall, visually, forwards seem to be slightly more spread out than backs.

• UNUSUAL
• Describing any unusual features
What do you see? What does this mean for the sample? What does this mean for the population?
Looking at the graphs I can see that the forwards have one player who weighs more than most of the other forwards. He is a New Zealander weighing 137 kg and is 1.81 m tall. This could be because he is a stockier, more muscular player, which are characteristics I would expect a forward to be more likely to have. {research?}

• CENTRE
• Comparing the middle 50%
What do you see? What does this mean for the sample? What does this mean for the population?
The forwards’ median weight is kg higher than the backs’ median weight. The middle 50% of the forwards’ weights are between kg and kg whereas the middle 50% of the backs’ weights are between 88.0 kg and 95.5 kg.
Remember this structure is only a guide

• CLUSTERS
• Where does most of the data lie, and are there any groupings?
What do you see? What does this mean for the sample? What does this mean for the population?
There are two discernible groups for the forwards, one between 97 kg and 105 kg and the other between 115 kg and 120 kg. This could be due to the heavier group being props and the lighter group being flankers. (If we have access to the raw data we could actually find this out!)
This is a great opportunity to integrate some research. What can we find out about the weight of props and flankers on the internet?

• OVERLAP
• Is there a visible overlap of the boxes?
What do you see? What does this mean for the sample? What does this mean for the population?
The lower quartile of the forwards’ weights is higher than the upper quartile of the backs’ weights. Therefore the middle 50% do not overlap. This suggests that weights for forwards will be higher on average than the weights for backs.

• SHAPE
• What is the distribution like?
What do you see? What does this mean for the sample? What does this mean for the population?
The forwards’ weights appear to have two distinct groupings and to be skewed to the right, whereas the backs’ weights seem reasonably symmetrical. This means that the weights of the backs are more evenly spread out but cluster around the median, following an almost normal distribution. The forwards, however, have weights that are more variable, with two distinct groupings and a particularly heavy player who skews the data to the right. Backs appear to be unimodal whereas the forwards are potentially bimodal. However, there is only one player skewing the data to the right, so this could be down to sampling variability.

• Open run mode
• Import data
• Choose your Variable 1 (has to be numerical)
• Subset by your two groups
Import ‘Student Data’ and draw a comparison box and whisker graph for the head perimeter between males and females.
Get summary statistics
** Data is based on Year 11 students at Blah College

[Graph: Female / Male]
S - SPREAD
U - UNUSUAL FEATURES
C - CLUSTERS
C - CENTRE
O - OVERLAP
S - SHAPE

[Graph: Female / Male]
SPREAD: Compare the IQR (middle 50% spread)
Female IQR = 58 – 55 = 3
Male IQR = 58 – 53.75 = 4.25
The middle 50% of head circumferences belonging to the male Year 11 students at Blah College are more spread out than the middle 50% of head circumferences of female Year 11 students at Blah College. This is shown by the male head circumference IQR being larger by 1.25 cm. This could be because … (possible reason why)
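The IQR arithmetic on this slide can be checked with a few lines of Python. The quartiles are the ones quoted in this deck for the Blah College head circumferences (55 cm and 58 cm for females, 53.75 cm and 58 cm for males); the helper name `iqr` is illustrative.

```python
def iqr(lower_quartile, upper_quartile):
    """Interquartile range: the spread of the middle 50% (UQ minus LQ)."""
    return upper_quartile - lower_quartile


female_iqr = iqr(55, 58)        # cm
male_iqr = iqr(53.75, 58)       # cm
difference = male_iqr - female_iqr

print(female_iqr, male_iqr, difference)
```

The difference between the two IQRs is the figure to quote when saying how much more spread out one group's middle 50% is.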

[Graph: Female / Male]
Unusual features/value: There is one unusually small head circumference for Year 11 males at Blah College at 46 cm, whereas there are no unusual head circumferences for females at Blah College. This could be because … (possible reason why)

[Graph: Female / Male]
Clusters: Most of the head circumferences for Year 11 females at Blah College are between 53 cm and 58 cm, whereas most of the head circumferences for the Year 11 males at Blah College are between 54 cm and 58 cm. There also seem to be two groupings of Year 11 female students, with head circumferences of 57 cm and 55 cm, whereas the male Year 11 students seem to be more scattered with no clusters. This could be because … (possible reason why)

[Graph: Female / Male]
Centre: Expectation is to compare the middle 50%
Female middle 50% = between 55 cm and 58 cm, median = 57 cm
Male middle 50% = between 53.75 cm and 58 cm, median = 55 cm
The median head circumference for Year 11 female students at Blah College is 2 cm bigger than that of the male Year 11 students at Blah College. The middle 50% of Year 11 female students at Blah College is between 55 cm and 58 cm, which is approximately the same as for the Year 11 male students at Blah College. That is, the middle 50% of students have roughly the same head circumference whether they are male or female. This could be because … (possible reason why)

[Graph: Female / Male]
Overlap: Do the boxes (middle 50%) overlap?
Female middle 50% = between 55 cm and 58 cm
Male middle 50% = between 53.75 cm and 58 cm
There is significant overlapping of the middle 50% between male and female Year 11 students at Blah College, which suggests that we may not be able to make a call on whether there is a difference in head circumferences between males and females. This could be because … (possible reason why)
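Whether two boxes overlap comes down to comparing quartiles: the boxes overlap when each box's lower quartile is no higher than the other box's upper quartile. A small Python sketch of that check follows, using the head circumference quartiles quoted on this slide; the function name `boxes_overlap` is an illustrative assumption.

```python
def boxes_overlap(box_a, box_b):
    """Each box is (lower quartile, upper quartile). True if the middle 50%s overlap."""
    (a_lq, a_uq), (b_lq, b_uq) = box_a, box_b
    return a_lq <= b_uq and b_lq <= a_uq


female_box = (55, 58)       # cm, from the slide
male_box = (53.75, 58)      # cm, from the slide
heads_overlap = boxes_overlap(female_box, male_box)

print(heads_overlap)
```

When the check returns False, as in the rugby example where the forwards' lower quartile sits above the backs' upper quartile, the middle 50% of the two groups are completely separated.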

[Graph: Female / Male]
Symmetry: Both male and female students at Blah College have asymmetric distributions, meaning there is an uneven distribution. This is because the head circumferences for female Year 11 students at Blah College have been pushed slightly towards larger head circumferences, whereas the males have been pushed slightly towards smaller head circumferences, skewing both distributions. This could be because … (possible reason why)

I wonder what the difference is between the mean wing length of male Pegasus Ponies that are 18 years of age or over and female Pegasus Ponies that are 18 years of age or over.
Draw a comparison box and whisker graph of the wing length of Pegasus Ponies at Paradise Estate.
Describe any features.

Comment on the sample distribution for your TWO investigation questions (Heights and Spike).
Copy and paste ANY relevant graphs and/or statistics you have used.
Describe the features.

I wonder if there are any differences between the mean wing length of male and female Pegasus Ponies that are 18 years of age or over.
I am fairly confident that there is a difference between the mean wing length of female and male Pegasus Ponies that are 18 years or over. I can make the call that males have longer wings than females, as both ends of the bootstrap confidence interval are positive. I can also say that male Pegasus Ponies 18 years and over have a mean wing length that is somewhere between 1.534 cm and 3.595 cm larger than the mean female wing length.

Answer both of your comparison questions
1. Open iNZight in bootstrap VIT mode
2. Import appropriate data
3. Show bootstrap distribution
4. Calculate confidence interval
5. Write an inference
Remember:
We want to create a bootstrap confidence interval for the difference between median heights of female ponies and median heights of male ponies.
We want to create a bootstrap confidence interval for the difference in median heights between the ponies chased by Spike and the ponies not chased by Spike.
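A bootstrap confidence interval for a difference between two groups can be sketched in Python as follows. This is an illustration only, not iNZight's implementation: each group is re-sampled separately with replacement, the difference in medians is recorded each time, and the middle 95% of those differences forms the interval. The heights are made-up values and the function name is hypothetical.

```python
import random
import statistics


def bootstrap_difference(group_a, group_b, stat=statistics.median,
                         n_resamples=1000, level=0.95, rng=random):
    """Percentile bootstrap interval for stat(group_a) - stat(group_b)."""
    diffs = []
    for _ in range(n_resamples):
        re_a = rng.choices(group_a, k=len(group_a))  # re-sample each group separately
        re_b = rng.choices(group_b, k=len(group_b))
        diffs.append(stat(re_a) - stat(re_b))
    diffs.sort()
    tail = (1 - level) / 2
    return diffs[round(tail * n_resamples)], diffs[round((1 - tail) * n_resamples) - 1]


rng = random.Random(3)  # arbitrary seed
male = [152.0, 153.5, 154.1, 155.0, 156.2, 157.4, 158.8, 160.0]     # made-up heights (cm)
female = [140.0, 141.3, 142.7, 143.9, 145.0, 146.1, 147.2, 148.0]   # made-up heights (cm)

low, high = bootstrap_difference(male, female, rng=rng)
can_make_the_call = low > 0 or high < 0  # interval does not include 0
print(round(low, 2), round(high, 2), can_make_the_call)
```

If the interval lies entirely on one side of 0, the call can be made that one group's median is larger back in the population; if it includes 0, no call can be made. Passing `stat=statistics.mean` gives the interval for a difference in means instead.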

Complete this sheet in student resources

Make a formal statistical inference.
Conclude your investigation, reflecting on your hypothesis and justifying your formal inference.
This may include:
- Discussing sampling variability, including the variability of estimates
- Reflecting on the process you have used to make the formal inference
- Discussing your choice of the mean or median
- Are there any lurking variables that you could consider next to improve your investigation?

I wonder if there are any differences between the mean wing lengths of male and female Pegasus Ponies that are 18 years of age or over.
(Copy and paste your question into your conclusion.)

When looking at the sample variation between males and females, the females’ wing lengths are a lot more spread out than the males’. However, when you compare just the middle 50% spread, they only have a difference of 0.65 cm, which is very small. This leads me to believe that if I had a different sample the spread could potentially be different, where there may not be as many female Pegasus Ponies with short wings. If this was the case, this would push up the mean, but may have little effect on the median.

Through my research about Pegasus Ponies’ wings I have learnt that female Pegasus Ponies have a different shape of wing, as they are narrower, so looking just at the length of the wing may not be enough to make a recommendation about whether to make special female army wing guards.

When looking at the standard deviation, there is not much variation in the difference between the means. The bootstrap interval is also entirely above 0, which gives me confidence that there is indeed a difference in mean wing lengths.

Based on my investigation and the sample that I was given, I would conclude that there is a difference between male and female wing lengths for all Pegasus Ponies that are 18 years and over. I therefore make the recommendation that they should be making special wing guards for females.

Because our Ponies are fictional, we are not going to write a conclusion based on this.

Research