STAT 101: Data Analysis and Statistical Inference

Slides:



Advertisements
Similar presentations
MGF1106 Math for Liberal Arts Sections Course website: Lecturer: Jim Wooland Lectures: TR 8:00 – 9:15, 101 HCB Labs: Mondays,
Advertisements

Exploratory Data Analysis I
Statistics: Unlocking the Power of Data Lock 5 STAT 250: Introduction to Biostatistics Kari Lock Morgan
STAT 101: Data Analysis and Statistical Inference Professor Kari Lock Morgan
Section 1.1 The Structure of Data. Why Statistics? Statistics is all about DATA  Collecting DATA  Describing DATA – summarizing, visualizing  Analyzing.
Economics 1 Principles of Microeconomics Instructor: Ted Bergstrom.
ECO201 PRINCIPLES OF MICROECONOMICS Instructor: Professor Bill Even Office: 3018 FSB Home page:
STAT 250.3: Introduction to Biostatistics Instructor: Efi Antoniou Introduction.
1 BA 275 Quantitative Business Methods Housekeeping Introduction to Statistics Elements of Statistical Analysis Concept of Statistical Analysis Exploring.
Class 1: Sept. 9 About instructor: Dylan Small, Assistant Professor, Department of Statistics. How I got interested in statistics?
My Policies and Some Advice for Doing Well in this Course.
Dr. Tatiana Erukhimova [year] Overview of Today’s Class Folders Syllabus and Course requirements Tricks to survive Mechanics Review and Coulomb’s Law.
Math 260 Hybrid(5518) Orientation–Summer 2015 Monday, June 15, 2015, 11:00-12:30 Instructor: Anne Siswanto Website: Course.
STA 320: Design and Analysis of Causal Studies Dr. Kari Lock Morgan and Dr. Fan Li Department of Statistical Science Duke University.
Welcome to Biology 102! Please pick up a syllabus (if you don’t have one yet) and a clicker at the front desk. You will need to rent a clicker from the.
CHEMISTRY Professor Richard Karpeles. Spring 2014 Chemistry 2 (84.122) Dr. Richard Karpeles Olney Hall 502A (978)
Synthesis and Review 3/26/12 Multiple Comparisons Review of Concepts Review of Methods - Prezi Essential Synthesis 3 Professor Kari Lock Morgan Duke University.
Please CLOSE YOUR LAPTOPS, and turn off and put away your cell phones, and get out your note-taking materials. Today’s daily quiz will be given at the.
Welcome to MM207! Unit 1 Seminar To resize your pods: Place your mouse here. Left mouse click and hold. Drag to the right to enlarge the pod.
COMP 111 Programming Languages 1 First Day. Course COMP111 Dr. Abdul-Hameed Assawadi Office: Room AS15 – No. 2 Tel: Ext. ??
Confidence Intervals I 2/1/12 Correlation (continued) Population parameter versus sample statistic Uncertainty in estimates Sampling distribution Confidence.
Using Lock5 Statistics: Unlocking the Power of Data
1 BA 275 Quantitative Business Methods Housekeeping Introduction to Statistics Elements of Statistical Analysis Concept of Statistical Analysis Statgraphics.
Statistics: Unlocking the Power of Data Lock 5 Afternoon Session Using Lock5 Statistics: Unlocking the Power of Data Patti Frazer Lock University of Kentucky.
Welcome to the MTLC MATH 110 Fall Welcome to the MTLC Instructors Sections 01, 03: Jil Chambless Section 05:Nathan Jackson Sections 07, 09: Mary.
Welcome to the MTLC MATH 110 Fall Welcome to Math 110 Instructors Sections 01: P Yao Section 03, 05: M Agwang Section 07, 09: A Acharyya Section.
Lecturer:Prof. Elizabeth A. Ritchie, ATMO TAs:Mr. Adrian Barnard Ms. Anita Annamalai NATS 101 Introduction to Weather and Climate Section 14: T/R 2:00.
CGS-2531 Problem Solving with Computer Software Course home page: Course.
CS 101 Today’s class will start 5 minutes late. CS 101 Introduction to Computer Science Aaron Bloomfield University of Virginia Spring 2007.
381 QSCI Winter 2012 Introduction to Probability and Statistics.
Welcome to Physics 1D03.
Lecture 1: Introduction I am Dr. Rong Fu, your instructor of this class. Welcome to the first class of GEO 302C Climate: Past, Present and Future! Before.
Syllabus Highlights Spring Full syllabus is on myCourses! Exams (3 hourly + 1 final – drop one) : 300 points Attendance 12 points Homework 60 points.
Welcome to CS 115! Introduction to Programming. Class URL Write this down!
Welcome to the MTLC MATH 110 Spring Welcome to Math 110 Instructors Sections 01, 03: Jil Chambless Section 05, 07: Mary Maxwell Section 10:Larry.
Welcome MATH Spring 2011 Instructor: Dr. Larry Bowen.
Quantitative Methods in Geography Geography 391. Introductions and Questions What (and when) was the last math class you had? Have you had statistics.
CSE 113 Introduction to Computer Programming Lecture slides for Week 1 Monday, August 29 th, 2011 Instructor: Scott Settembre.
Inference after ANOVA, Multiple Comparisons 3/21/12 Inference after ANOVA The problem of multiple comparisons Bonferroni’s Correction Section 8.2 Professor.
Psychology Fundamentals (B) Professor Mark Steyvers Course website:
Welcome to the MTLC MATH 115 Spring MTLC Information  Hours of Operation  Sunday:4:00pm – 10:00pm  Monday – Thursday: 8:00am – 10:00pm  Friday:8:00am.
Statistics: Unlocking the Power of Data Lock 5 Exam 2 Review STAT 101 Dr. Kari Lock Morgan 11/13/12 Review of Chapters 5-9.
Please CLOSE YOUR LAPTOPS, and turn off and put away your cell phones, and get out your note- taking materials.
Statistics: Unlocking the Power of Data Lock 5 STAT 101 Dr. Kari Lock Morgan 12/6/12 Synthesis Big Picture Essential Synthesis Bayesian Inference (continued)
Introductory Macroeconomics Lecture 1 Dr. Jennifer P. Wissink ©2015 Jennifer P. Wissink, all rights reserved. August 25, 2015.
Please initial the attendance roster near the door. If you are on the Wait List you will find your name at the bottom. If you are not on the roster, please.
Chemistry 101 Beth Lindquist 7 Chemistry Annex Office Hours: 9-10 am Tuesdays and Thursdays And by appointment.
CSCE 1030 Computer Science 1 First Day. Course Dr. Ryan Garlick Office: Research Park F201 B –Inside the Computer Science department.
STAT 250: Introduction to Biostatistics
Research Experience Program (REP) Spring 2008 Psychology 100 Ψ.
INTRODUCTION: WELCOME TO STAT 200 January 5 th, 2009.
Synthesis and Review 2/20/12 Hypothesis Tests: the big picture Randomization distributions Connecting intervals and tests Review of major topics Open Q+A.
Please initial the attendance roster near the door. If you are on the Wait List you will find your name at the bottom. If you are not on the roster, please.
Probability 4/18/12 Probability Conditional probability Disjoint events Independent events Section 11.1 (pdf)pdf Professor Kari Lock Morgan Duke University.
Financial Mathematics Math 373. Math 373 – Instructor Armstrong 1010 (Tu & Th at 1:30) and Matthews 210 (Th at 9:30) Jeff Beckley –
Where is this place? A Glacier National Park located in the U.S. state of Montana B Mt. Rainier National Park located in the U.S. state Washington B.
BUS 310 Statistics Bill Remus. u TuTh 1:30pm and 3pm u Bill Remus u C502 Office Hours W 1:30-4:30 and By Appointment u Phone: u
Lecture 1: Introduction I am Dr. Zong-Liang Yang, your instructor of this class. Welcome to the first class of GEO 302C Climate: Past, Present and Future!
Welcome to AP Stats!. The AP Exam Thursday, May12, This is during the second week of AP testing and about 4 weeks after Spring Break. The TEST:
Chapter 1 Introduction to Statistics 1-1 Overview 1-2 Types of Data 1-3 Critical Thinking 1-4 Design of Experiments.
電腦圖學 Computer Graphic with Programming
Intro to AP Statistics and Exam
Welcome to AP Stats!.
STAT 250: Introduction to Biostatistics
General Chemistry 1 CHEM 1411 Dr. Lorenz J Bauer
Lecture 0 Course Information
Stat Introduction to Statistical Concepts and Methods
Chapter 1 Data Analysis Ch.1 Introduction
Stat Introduction to Statistical Concepts and Methods
Presentation transcript:

STAT 101: Data Analysis and Statistical Inference Professor Kari Lock Morgan kari@stat.duke.edu

Course Website Course Website: http://stat.duke.edu/courses/Spring14/sta101.001/ Sakai: https://sakai.duke.edu/portal/site/STAT101_Spring14 Syllabus

Lecture Slides Lecture slides will be posted on the course website the day before class The slides posted will NOT be complete (I want you to think during class, so won’t give you the answers to questions posed). You are encouraged to take notes on the slides.

by Lock, Lock, Lock Morgan, Lock, and Lock Textbook Statistics: Unlocking the Power of Data by Lock, Lock, Lock Morgan, Lock, and Lock Purchasing options: Bookstore (new, used) wiley.com (e-book) Amazon.com (new, used, kindle, rent) Wiley Plus (wiley.com): interactive online text (linked videos, practice problems, odd solutions)

Clickers Two options: 1) Purchase an i>clicker remote (any version OK) 2) Get the i>clickerGO app for your smartphone Register your clicker, using NetID, athttp://www.iclicker.com/support/registeryourclicker/ The point of clicker questions is to motivate you to think actively about new material as it is being presented Credit simply for clicking in

Class Year What is your class year? First-year Sophomore Junior Senior

Major Your primary major (or potential future major) best falls under the category… Natural Sciences Arts and Humanities Social Sciences Math/Statistics/CS Other

Support My Office Hours: (in Old Chemistry 216) TA Office Hours: tbd Mon 3 – 4 pm, Wed 3 – 4 pm, Fri 1 – 3 pm TA Office Hours: tbd Statistics Education Center: (in Old Chem 211A) 4 – 9 pm Sunday – Thursday in Old Chem 211A Email: Email your TA or kari@stat.duke.edu

Grade Breakdown Labs 26 points (5%) Homework 50 points (10%) Clickers 24 points (5%) Projects 100 points (20 %) Midterms 150 points (30%) Final Exam 150 points (30%) TOTAL 500 points (Up to 10 extra credit points may be earned.)

Labs Labs are on Thursdays in Old Chem 101, starting tomorrow Statistical software, practice analyzing data Labs will be group based Labs will use all free software: StatKey: lock5stat.com/statkey Other free software: tbd

Homework Weekly homework due, usually on Mondays Point of homework: to LEARN! to make sure you are keeping up with the material to prepare you for projects and exams Graded problems and practice problems Grading Graded on a 6 point scale Lowest homework grade dropped Penalties for late homework

Projects Project 1 Project 2 Individual EDA, confidence intervals, hypothesis tests written report up to 5 pages in length Project 2 with your lab group Regression written report up to 10 pages in length

Exams Midterm Exams: 2/19 and 4/2 in class Final: 4/28 2 – 5pm Exams are mandatory and must be taken at the given time. Make-up exams will not be given. In extreme circumstances (severe illness), midterms may be excused only in advance. In this case the grade will be imputed by regression.

Keys to Success Come to class ready to think and be engaged Come to lab ready to think and be engaged Do the homework, trying it by yourself first Do lots of practice problems Read the textbook Stay on top of the material

Introduction to Data SECTION 1.1 Data Cases and variables Categorical and quantitative variables Using data to answer a question

Why Statistics? Statistics is all about DATA Collecting DATA Describing DATA – summarizing, visualizing Analyzing DATA Data are everywhere! Regardless of your field, interests, lifestyle, etc., you will almost definitely have to make decisions based on data, or evaluate decisions someone else has made based on data

Data Data are a set of measurements taken on a set of individual units Usually data is stored and presented in a dataset, comprised of variables measured on cases

Cases and Variables We obtain information about cases or units. A variable is any characteristic that is recorded for each case. Generally each case makes up a row in a dataset, and each variable makes up a column

Countries of the World Country Land Area Population Rural Health Internet Birth Rate Life Expectancy HIV Afghanistan 652230 29021099 76 3.7 1.7 46.5 43.9 Albania 27400 3143291 53.3 8.2 23.9 14.6 76.6 Algeria 2381740 34373426 34.8 10.6 10.2 20.8 72.4 0.1 American Samoa 200 66107 7.7 Andorra 470 83810 11.1 21.3 70.5 10.4 Angola 1246700 18020668 43.3 6.8 3.1 42.9 47 2 Antigua and Barbuda 440 86634 69.5 11 75 Argentina 2736690 39882980 8 13.7 28.1 17.3 75.3 0.5 Ask: What are the cases? What are the variables?

Intro Statistics Survey Data Ask: What are the cases? What are the variables?

Diet Coke and Calcium Drink Calcium Excreted Diet cola 50 62 48 55 58 61 56 Water 46 54 45 53 Ask: What are the cases? What are the variables?

Data US News and World Report National University Rankings Stock Market Duke Basketball Unemployment Rate Hybrid Cars Public Opinion Antidepressants and Alzheimer’s

Data Applicable to You Think of a potential dataset (it doesn’t have to actually exist) that you would be interested in analyzing What are the cases? What are the variables? What interesting questions could it help you answer? Have students discuss, share answers

Counties with the highest kidney cancer death rates For fun, ask students to hypothesize about kidney cancer risk factors Counties with the highest kidney cancer death rates Source: Gelman et. al. Bayesian Data Anaylsis, CRC Press, 2004.

Counties with the lowest kidney cancer death rates Key: sample size. Smaller counties are more likely to have more extreme death rates – tell them they will learn about this. Counties with the lowest kidney cancer death rates Source: Gelman et. al. Bayesian Data Anaylsis, CRC Press, 2004.

Kidney Cancer If the values in the kidney cancer dataset are rates of kidney cancer deaths, then what are the cases? The people living in the US The counties of the US A person either has kidney cancer or doesn’t… a rate must apply to a group of people, such as a county

Kidney Cancer If the values in the kidney cancer dataset are yes/no, then what are the cases? The people living in the US The counties of the US A person either has kidney cancer or doesn’t. Yes/no doesn’t make sense for a county.

Categorical versus Quantitative Variables are classified as either categorical or quantitative: A categorical variable divides the cases into groups A quantitative variable measures a numerical quantity for each case

Categorical Quantitative Ask them to classify each as categorical or quantitative

Kidney Cancer If the cases in the kidney cancer dataset are counties, then the measured variable is… Categorical Quantitative Rates are numbers (quantitative).

Kidney Cancer If the cases in the kidney cancer dataset are people, then the measured variable is… Categorical Quantitative Either having kidney cancer or not is categorical.

(the answer to all of these questions is yes!) Variables For each of the following situations: What are the variables? Is each variable categorical or quantitative? Can eating a yogurt a day cause you to lose weight? Do males find females more attractive if they wear red? Does louder music cause people to drink more beer? Are lions more likely to attack after a full moon? (the answer to all of these questions is yes!)

Summary Data are everywhere, and pertain to a wide variety of topics A dataset is usually comprised of variables measured on cases Variables are either categorical or quantitative Data can be used to provide information about essentially anything we are interested in and want to collect data on!

To Do Read Section 1.1 If you haven’t already… Get the textbook Get a clicker or app for your smartphone and register it at http://www.iclicker.com/support/registeryourclicker/ (for Student ID use your NetID)

http://www.youtube.com/watch?v=nTBZuQR7 dRc&feature=youtu.be Why Statistics? http://www.youtube.com/watch?v=nTBZuQR7 dRc&feature=youtu.be