Presentation transcript:

- Small liberal arts college: 1,350 undergraduate students
- Statistician within Department of Math, Stat and CS
- Class size: Stat 131 (30 students), 5-6 sections per year
- 3 hours per week in a computer or tech-enabled classroom

- What we know about randomization approaches
- What we don't
- What it means

- Tintle et al. flavor (2013 version)
  - Unit 1. Inference (single proportion)
  - Unit 2. Comparing two groups
    - Means, proportions, paired data
    - Descriptives, simulation/randomization, asymptotic
  - Unit 3. Other data contexts
    - Multiple means, multiple proportions, two quantitative variables
    - Descriptives, simulation/randomization, asymptotic

- Qualitative
  - Momentum:
    - Attendance at conference sessions, workshops
    - Publishers agreeing to publish the books
    - Class testers/inquiries
    - People doing this in their classrooms (clients, colleagues)
    - Repeat users
  - Appealing "in principle" and based on testimonials to date

- Quantitative assessment: Tintle et al. (2011, 2012)
  - Compared an early version of the curriculum (2009) to the traditional curriculum at the same institution, as well as to a national sample
  - 40-question CAOS test
  - Results: better student learning outcomes in some areas (design and inference); little evidence of declines

Example #1. Proportion of students correctly identifying that researchers want a small p-value if they hope to show statistical significance.

| Post-test answer | National sample | Hope 2007 | Hope 2009 |
| --- | --- | --- | --- |
| Small p-value | 68% | 86% | 96% |

Sample sizes: Hope ~200 per group; national sample 760. P < 0.001 between cohorts. Pre-test: 50-60% correct.
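The p < 0.001 comparison between the Hope cohorts can be sanity-checked from the approximate counts the slide reports. A minimal sketch, assuming ~200 students per cohort (the exact counts are not given); this is a reconstruction, not the authors' analysis:

```python
# Illustrative check of "P < 0.001 between cohorts" using approximate
# counts implied by the slide (~200 students per Hope cohort).
from scipy.stats import chi2_contingency

hope_2007 = [172, 28]  # ~86% of ~200 correct on the post-test
hope_2009 = [192, 8]   # ~96% of ~200 correct on the post-test

chi2, p, dof, expected = chi2_contingency([hope_2007, hope_2009])
print(f"p-value: {p:.4f}")  # roughly 0.001, consistent with the slide
```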

Results:
- 14 instructors, 7 institutions
- Total combined sample size of 783

| Instructor (Inst, Class size) | Pre-test | Post-test | Change | Sample size |
| --- | --- | --- | --- | --- |
| 1 (LA, Med) | 70% | 97% | 27% | 33 |
| 2 (LA, Med) | 73% | 95% | 22% | 26 |
| 3 (Univ, Med) | 23% | 95% | 72% | 40 |
| 4 (LA, Med) | 70% | 96% | 26% | 127 |
| 5 (LA, Sm) | 28% | 92% | 64% | 11 |
| 6 (Univ, Med) | 37% | 96% | 59% | 49 |
| 7 (Univ, Sm) | 39% | 73% | 34% | 23 |
| 8 (LA, Med) | 60% | 97% | 37% | 35 |
| 9 (LA, Med) | 29% | 96% | 67% | 95 |
| 10 (HS, Med) | 24% | 74% | 50% | 38 |
| 11 (Univ, Large) | 68% | 97% | 29% | 118 |
| 12 (LA, Med) | 63% | 93% | 30% | 92 |
| 13 (LA, Med) | 28% | 95% | 68% | 18 |
| 14 (LA, Med) | 56% | 97% | 41% | 78 |

- Institutional diversity in student background (pre-test)
- Post-test performance very good for most (over 90%)
- A couple of exceptions
  - Both were first-time instructors with the curriculum who will use it again this year

Example #1 (continued).
- First quiz, 2.5 weeks into the course; simulation for a single proportion
- 119 people played rock-paper-scissors (RPS); 11.8% picked scissors
- Is this evidence that scissors are picked less than 1/3 of the time in the long run?

The following graph shows 1000 different "could have been" sample proportions choosing scissors, for samples of 119 people, assuming scissors is chosen 1/3 of the time in the long run.
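A minimal sketch of the kind of simulation this slide describes (not the authors' actual applet or code), under the null hypothesis that scissors is chosen 1/3 of the time:

```python
# Simulate 1000 "could have been" sample proportions under the null
# that scissors is picked 1/3 of the time in the long run.
import numpy as np

rng = np.random.default_rng(seed=1)

n = 119           # sample size from the slide
p_null = 1 / 3    # long-run proportion under the null
observed = 0.118  # observed proportion picking scissors

sim_props = rng.binomial(n, p_null, size=1000) / n

# One-sided p-value: how often does chance alone give a proportion
# as small as (or smaller than) the observed 11.8%?
p_value = np.mean(sim_props <= observed)
print(f"approximate p-value: {p_value:.3f}")  # essentially 0 here
```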

Would you consider the results of this study to be convincing evidence that scissors are chosen less often in the long run than expected?

- No, the p-value is going to be large: 8%
- No, the p-value is going to be small: 2%
- Yes, the p-value is going to be small: 77%
- Yes, the p-value is going to be large: 9%
- No, the distribution is centered at 1/3: 4%

Suppose the study had involved only 50 people, but with the same sample proportion picking scissors. How would the p-value change?

- It would not change; the sample proportion was the same: 22%
- It would be smaller: 11%
- It would be larger: 66%
- Not enough information: 1%

Single instructor (me), 92 students, across 4 sections and 2 semesters.
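Rerunning the earlier sketch with n = 50 illustrates the correct answer (the p-value gets larger): with less data, the null distribution of sample proportions is more spread out, so 11.8% is less surprising. A hypothetical rerun, not data from the slide:

```python
# Same simulation as before, but with n = 50 and the same observed
# proportion (0.118).
import numpy as np

rng = np.random.default_rng(seed=1)
sim_props = rng.binomial(50, 1 / 3, size=1000) / 50
p_value = np.mean(sim_props <= 0.118)
print(f"approximate p-value with n = 50: {p_value:.3f}")  # larger than with n = 119
```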

Example #2. Moving beyond a specific item to sets of related items, and retention.
- Tintle et al. (SERJ) + JSE
  - Improvement in Data Collection and Design, Tests of Significance, and Probability (Simulation) on the post-test
  - Data Collection and Design and Tests of Significance improvements were retained significantly better than in the consensus curriculum

- Retention significantly better (p = 0.02)

Example #3. How are weak students doing?

| Group | Pre-test | Post-test | Change |
| --- | --- | --- | --- |
| Lowest (n=210; 13 or less) | 38% | 55% | 17% |
| Middle (n=329; 14-17) | 52% | 60% | 8% |
| Highest (n=250; 18+) | 66% | 69% | 3% |

All changes are highly significant using paired t-tests (p < 0.001).

**Among those who completed the course; anecdotally, we're seeing a lower dropout rate now than with the consensus curriculum.
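A minimal sketch of the paired t-test behind "all changes are highly significant". The slide reports only group means, group sizes, and p < 0.001, so the per-student score vectors below are hypothetical placeholders matched to the Lowest group:

```python
# Paired t-test sketch with hypothetical per-student CAOS scores
# (not the study's actual data).
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(seed=2)
pre = rng.normal(loc=0.38, scale=0.10, size=210)         # "Lowest" group pre-test
post = pre + rng.normal(loc=0.17, scale=0.10, size=210)  # mean gain of 17 points

t_stat, p_value = ttest_rel(post, pre)
print(f"t = {t_stat:.2f}, p = {p_value:.2e}")
```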

Example #4. Do students understand new data contexts?
- Old AP Statistics question: 10 randomly selected laptop batteries were tested, and the hours they lasted were measured.

To investigate whether the shape of the sample data distribution was simply due to chance, or whether it actually provides evidence that the population distribution of battery lifetimes is skewed to the right, the engineers at the company decided to take 100 random samples of lifetimes, each of size 10, from a perfectly symmetric, normally distributed population with a mean of 2.6 hours and a standard deviation of 0.29 hours. For each of those 100 samples, the statistic (sample mean divided by sample median) was calculated. A dotplot of the 100 simulated skewness ratios is shown below.
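The engineers' process, as described above, is itself straightforward to simulate. A sketch under the stated assumptions (100 samples of size 10 from a Normal population with mean 2.6 and SD 0.29):

```python
# Reproduce the engineers' null distribution of skewness ratios:
# sample mean / sample median for 100 samples of size 10 drawn from
# a symmetric Normal(2.6, 0.29) population.
import numpy as np

rng = np.random.default_rng(seed=3)

ratios = []
for _ in range(100):
    sample = rng.normal(loc=2.6, scale=0.29, size=10)  # one sample of 10 lifetimes
    ratios.append(sample.mean() / np.median(sample))   # skewness ratio
ratios = np.array(ratios)

# The observed ratio from the 10 tested batteries would then be compared
# to this distribution to judge whether the apparent right skew could
# plausibly be chance alone.
print(f"simulated ratios: min {ratios.min():.3f}, max {ratios.max():.3f}")
```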

What is the explanation for why the engineers carried out the process above?

- This process allows them to determine the percentage of the time the sample distribution would be skewed to the right: 3%
- This process allows them to compare their observed skewness ratio to what could have happened by chance if the population distribution was really symmetric/normally distributed: 64%
- This process allows them to determine how many times they need to replicate the experiment for valid results: 10%
- This process allows them to compare their observed skewness ratio to what could have happened by chance if the population distribution was really right-skewed: 23%

- Analysis of all (free-response) class tests is ongoing
- Do students integrate the observed statistic and simulated values to draw a conclusion?

Summary
- Preliminary and current versions showed improved post-course performance in understanding of tests of significance, design, and probability (simulation), and improved retention in these areas
- These results appear stable across lower-performing students with both older and newer versions of the curriculum
- Some evidence of student ability to apply the framework of inference (3-S) to novel situations

Summary (continued)
- Some instructor differences, but also preliminary validation of the "transferability" of findings across different institutions/instructors; what about new instructors?
- **Note: Some evidence of weaker performance in descriptive statistics in this earlier curriculum; the descriptive statistics approach has since been substantially changed to combat this.

- What's making the change?
  - Content?
  - Pedagogy?
  - Repetition?
- How much randomization before you see a change?
- Are there differences in student performance based on curricula? Are they important?

- What are the developmental learning trajectories for inference (do students understand what we mean by 'simulation')? For other topics?
- Low-performing students: promising so far (ACT, GPA)
- Does improved performance transfer across institutions/instructors? What kind of instructor training/support is needed to be successful?
- We are using CAOS (or adapted CAOS) questions, but do we still all agree these are the "right" questions? Is knowing what a small p-value means enough? What level of understanding are students attaining?
- Why do students in both curricula tend to do poorly on descriptive statistics questions? And what about other areas where we see little difference between curricula?

- Preliminary indications continue to be positive
- You can cite similar or improved performance on nationally standardized/accepted/normed tests for the approach
- Tag line for peers and clients:
  - We are improving some areas (the important ones?) and doing no harm elsewhere
- Still lots of room for better understanding and continued improvement of the approach
- Student engagement (see yesterday's talk)
- Next steps: a larger, more comprehensive assessment effort coordinated between users of the randomization-based curriculum and those who don't use it. If you are interested, let me know.

- Author team (Beth Chance, George Cobb, Allan Rossman, Soma Roy, Todd Swanson, and Jill VanderStoep)
- Class testers
- NSF funding

- Tintle NL, VanderStoep J, Holmes V-L, Quisenberry B, and Swanson T. "Development and assessment of a preliminary randomization-based introductory statistics curriculum." Journal of Statistics Education 19(1), 2011.
- Tintle NL, Topliff K, VanderStoep J, Holmes V-L, and Swanson T. "Retention of statistical concepts in a preliminary randomization-based introductory statistics curriculum." Statistics Education Research Journal, 2012.