Lecture 4 Outline: Chapter 1.3, 1.5

Presentation transcript:

Lecture 4 Outline
Chapter 1.3, 1.5
– Control in Experimental Design
– Causal Inference in Observational Studies
– Summarizing Data
  – Numerical methods
  – Graphical methods

The meaning of the causal inference
In the motivation-creativity study, we concluded that there is strong evidence that the “intrinsic questionnaire” treatment caused a difference in creativity compared to the “extrinsic questionnaire” treatment. This difference could be caused by anything that differs between the two treatments, e.g., the actual questionnaire, the order in which the poems were judged, or the relative preferences of the judges for the two treatments.

Control in Experimental Design
The principle of control in experimental design is to make sure that all factors other than the intended treatments are kept the same in the different groups. Then we can conclude that the intended treatment causes a difference between the groups.
Examples of control:
– Use a placebo for the control group.
– Use double blinding.
– Judge poems in random order.

Causal Inference in an Observational Study
In an observational study, we could assume that, unbeknownst to us, the subjects were randomly assigned to treatments (i.e., there are no confounding variables). Then we could use the randomization test p-value to make inferences. But this is a “fictitious” probability model, which might or might not be valid. Inferences based on a randomized experiment are much stronger because the probability model on which they are based (that of random assignment) is known to be correct.
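The randomization test p-value mentioned above can be approximated by reshuffling group labels many times. The following is a minimal sketch, assuming a two-sided test on the difference in group means; the function name `randomization_test` and the scores are hypothetical illustrations, not data from the motivation-creativity study.

```python
import random


def mean(values):
    return sum(values) / len(values)


def randomization_test(group_a, group_b, n_shuffles=10_000, seed=0):
    """Approximate two-sided randomization-test p-value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        # Re-deal the pooled outcomes into two groups of the original sizes,
        # as if the treatment labels had been assigned at random.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / n_shuffles


# Hypothetical creativity scores (assumed for illustration only).
intrinsic = [22.1, 20.5, 19.8, 24.0, 21.3, 23.7]
extrinsic = [15.2, 17.9, 18.6, 14.1, 19.5, 16.8]
p_value = randomization_test(intrinsic, extrinsic)
print(f"approximate p-value: {p_value:.4f}")
```

In a randomized experiment the reshuffling mirrors what was actually done. In an observational study the same arithmetic runs, but the p-value is only as trustworthy as the assumption that assignment behaved as if it were random.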

Meaningful Comparisons
Main lesson of the chapter: the best way to compare two (or more) groups is to do a randomized experiment or take a random sample. This avoids systematic bias due to confounding variables and selection bias.
If this is not possible, we should generally try to make the groups as “comparable” as possible by adjusting for known confounding variables and selection biases.
Often, important first steps are to use an appropriate control group and to compare the appropriate rates rather than absolute numbers.

Control Group
In a randomized experiment, we want the treatment and control groups to be similar in every way except that one takes the treatment and the other doesn’t, i.e., we use a placebo and double blinding. Similarly, in an observational study, we want to compare the treatment group to a control group that is as similar as possible.
Exercise: Explain the need for a control group by criticizing the statement “A study on the benefits of vitamin C showed that 90% of the people suffering from a cold who take vitamin C get over their cold within a week.”

Use of Rates
An article in This Week magazine says that if you went “hurtling down the highway at 70 miles an hour, careening from side to side,” you would have four times as good a chance of staying alive if the time were seven in the morning than seven at night. The evidence: “Four times more fatalities occur on the highways at 7 p.m. than 7 a.m.” Does the conclusion follow from the evidence?
More accidents occur in clear weather than in foggy weather. Is clear weather safer to drive in?
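A small calculation shows why counts alone cannot settle the question. The sketch below uses made-up traffic figures (both the fatality counts and the vehicle-miles are assumptions chosen for illustration) in which four times as many fatalities occur in the evening, yet the risk per mile driven is identical because four times as much driving happens then.

```python
def fatality_rate(fatalities, vehicle_miles):
    """Fatalities per 100 million vehicle-miles traveled."""
    return fatalities * 100_000_000 / vehicle_miles


# Hypothetical figures: the evening hour has 4x the fatalities AND 4x the traffic.
morning = {"fatalities": 50, "vehicle_miles": 2_000_000_000}
evening = {"fatalities": 200, "vehicle_miles": 8_000_000_000}

morning_rate = fatality_rate(**morning)
evening_rate = fatality_rate(**evening)

print(f"7 a.m.: {morning['fatalities']} fatalities, rate {morning_rate}")
print(f"7 p.m.: {evening['fatalities']} fatalities, rate {evening_rate}")
```

The same reasoning applies to the weather question: more accidents in clear weather may simply reflect that far more driving is done in clear weather.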

Experimental Design Example: Salk Vaccine Field Trial
In the first half of the 20th century, polio was one of the most frightening diseases, striking hardest at young children and leaving many helpless cripples. By the 1950s, Jonas Salk had developed a vaccine for polio that proved promising in laboratory experiments, but it was necessary to try it in the real world before releasing it for general use.

Designs for Salk Vaccine Field Trial
– Historical Control Approach: Distribute the vaccine as widely as possible, through the schools, to see whether the rate of reported polio is appreciably less than usual during the subsequent season.
– Observed Control Approach: Offer vaccination to all children in the second grade of participating schools and follow the polio experience not only in these children but also in the first and third grade children.
– Placebo Control Approach: Choose the control group from the same population as the treatment group: children whose parents consented to vaccination. Assign the treatment randomly. Give a placebo to the control group. Do not tell doctors which group children belong to.
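The random assignment step of the placebo control approach can be sketched in a few lines. This is only an illustration under simplifying assumptions (equal-sized groups, a single list of consenting children); the function name `assign_treatment` and the subject identifiers are hypothetical and do not describe how the actual field trial was administered.

```python
import random


def assign_treatment(subject_ids, seed=None):
    """Randomly split consenting subjects into vaccine and placebo groups."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # Both groups come from the same consenting population, so they differ
    # only by chance and by which injection they receive.
    return {"vaccine": sorted(shuffled[:half]), "placebo": sorted(shuffled[half:])}


# Hypothetical identifiers for children whose parents consented.
consenting_children = [f"child_{i:03d}" for i in range(1, 11)]
groups = assign_treatment(consenting_children, seed=42)
print(groups)
```

For blinding, the table mapping identifiers to groups would be kept away from the doctors who diagnose cases.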

Polio Example
Using Figure 1 as an example, explain why a contemporaneous control group is needed in experiments where the effectiveness of a drug or vaccine is being tested.
Comment on the use of the number of cases. What would be a more appropriate indicator of whether polio incidence was increasing?

Summarizing Data
Numerical summaries
– Measures of center: mean, median, mode
– Measures of spread: sample standard deviation (s), interquartile range
Graphical methods
– Relative frequency histograms
– Stem and leaf diagrams
– Box plots
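The numerical summaries listed above can all be computed with Python's standard `statistics` module. The salary figures below are hypothetical, and the quartiles use the module's default ("exclusive") interpolation rule, which is one of several conventions; JMP or a textbook may report slightly different quartiles for the same data.

```python
import statistics

# Hypothetical salaries (assumed for illustration only).
salaries = [4620, 5040, 5100, 5100, 5220, 5400, 5700, 6000, 6300, 8100]

# Measures of center
mean = statistics.mean(salaries)
median = statistics.median(salaries)
mode = statistics.mode(salaries)

# Measures of spread
s = statistics.stdev(salaries)  # sample standard deviation (divides by n - 1)
q1, q2, q3 = statistics.quantiles(salaries, n=4)  # quartile cut points
iqr = q3 - q1  # interquartile range

print(f"mean={mean}, median={median}, mode={mode}")
print(f"s={s:.1f}, Q1={q1}, Q3={q3}, IQR={iqr}")
```

With one large salary in the list, the mean sits above the median, which is the usual sign of right skew.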

Relative Frequency Histograms
– A histogram is a graph that shows the relative frequency per unit of measurement.
– The areas of the blocks represent the percentage of observations in the blocks.
– The heights of the blocks represent relative frequency per unit of measurement, i.e., crowding (percentage per unit of measurement).
– Histograms show broad features, particularly the center, spread, and shape of the distribution (symmetric or skewed, light tailed or heavy tailed).
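The distinction between block area and block height matters most when the bins have unequal widths. The sketch below, with hypothetical values and bin edges, computes each block's area (share of observations) and height (share per unit of measurement). The helper name `relative_frequency_histogram` is an assumption for illustration, and bins are taken to include their left edge, with the last bin also including its right edge.

```python
def relative_frequency_histogram(values, bin_edges):
    """Return (left, right, area, height) for each histogram block."""
    n = len(values)
    last = len(bin_edges) - 2
    blocks = []
    for i, (left, right) in enumerate(zip(bin_edges, bin_edges[1:])):
        count = sum(
            1 for v in values
            if left <= v < right or (i == last and v == right)
        )
        area = count / n                 # relative frequency in this bin
        height = area / (right - left)   # relative frequency per unit
        blocks.append((left, right, area, height))
    return blocks


# Hypothetical measurements and unequal-width bins.
values = [1, 2, 2, 3, 5, 6, 7, 12, 15, 18]
edges = [0, 5, 10, 20]
blocks = relative_frequency_histogram(values, edges)
for left, right, area, height in blocks:
    print(f"[{left}, {right}): area={area:.2f}, height={height:.3f}")
```

The last two bins hold the same share of the data, but the wider bin is drawn half as tall because its observations are spread over twice as many units.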

Histograms in JMP
– Click Analyze, then Distribution.
– Click the red triangle next to Distributions and click Stack to see a horizontal layout.
– Click Tools, then the hand (the grabber in Version 5), click on the histogram, and drag to change the position of the bars.
– To make histograms by group (e.g., the sex discrimination data), put Salaries in Y and Sex in the By box.
– Click the red triangle next to Distributions and click Stack to display the histograms horizontally.
– For both groups, click the red triangle next to Distributions and click Uniform Scaling to display the histograms on the same scale.

Stem and leaf diagrams
– A cross between a graph and a table.
– Gives a quick idea of the distribution.
– Shows center, spread, and shape, as a histogram does, but also shows exact values; it is easy to construct by hand, and the median can be computed from it.
Stem and leaf plots in JMP
– Click Analyze, Distribution.
– Put the variable of interest in Y and click OK.
– Click the red triangle next to the variable of interest (e.g., salaries) and click Stem and Leaf.
– Back-to-back stem and leaf plots are not available in JMP but are useful (see page 17).
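Constructing a stem and leaf diagram by hand amounts to sorting the values and splitting each one into a stem and a leaf. A minimal sketch follows, assuming non-negative integers with the tens digit(s) as the stem and the units digit as the leaf; the function name `stem_and_leaf` and the scores are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the rows of a stem and leaf diagram (stem = tens, leaf = units)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    rows = []
    # Include empty stems so gaps in the distribution stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = f"{stem:>2} | " + "".join(str(d) for d in leaves.get(stem, []))
        rows.append(row.rstrip())
    return rows


# Hypothetical test scores.
scores = [79, 87, 88, 89, 91, 92, 93, 65]
rows = stem_and_leaf(scores)
print("\n".join(rows))
```

Because the leaves are the actual digits, the exact values can be read back from the diagram, and the median can be found by counting in from either end.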

Box plots
The middle 50% of a group of measurements is represented by a box.
– The line in the middle of the box is the median.
Various features of the upper and lower 25% are shown by other symbols.
– The whiskers extend to the farthest point that is within 1.5 interquartile ranges of the upper and lower quartiles (IQR = third quartile – first quartile).
– Points farther away are shown individually as outliers.
– The width of a box plot is chosen to make the box look nice; it does not represent any aspect of the data.
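The quantities a box plot draws can be computed directly from the rules above. This sketch assumes the quartile convention of Python's `statistics.quantiles` (its default "exclusive" method), which may differ slightly from the quartiles JMP reports; the function name `box_plot_summary` and the data are hypothetical.

```python
import statistics


def box_plot_summary(values):
    """Compute the box, whisker ends, and outliers for a box plot."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers stop at the farthest data points still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical measurements with one unusually large value.
data = [2, 4, 4, 5, 6, 7, 8, 9, 10, 30]
summary = box_plot_summary(data)
print(summary)
```

Note that the whiskers end at actual data points, not at the fences themselves.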

Box plots in JMP
To draw one box plot:
– Click Analyze, Distribution.
To draw side-by-side box plots:
– Click Analyze, Fit Y by X, putting the outcome in Y and the group variable in X.
– Click the red triangle next to One Way Analysis, click Display Options, and then click Box Plot (this produces box plots that display the box, the whiskers, and all of the data points individually).
Display 1.13 shows histograms and box plots for four types of distributions.