Objectives (PSLS Chapter 19)


Inference for a population proportion. Topics: conditions for inference on proportions; the sample proportion p̂ (“p-hat”); the sampling distribution of p̂; the significance test for a proportion; the confidence interval for p; the sample size for a desired margin of error.

Conditions for inference on proportions. Assumptions: (1) The data used for the estimate are a random sample from the population studied. (2) The population is at least 20 times as large as the sample, which ensures that successive trials in the random sampling are approximately independent. (3) The sample size n is large enough that the sampling distribution is approximately Normal; how large depends on the type of inference conducted.

The sample proportion p̂. We now study categorical data and draw inference on the proportion, or percentage, of the population with a specific characteristic. If we call a given categorical characteristic in the population a “success,” then the sample proportion of successes, p̂ (“p-hat”), is: p̂ = (count of successes in the sample) / (count of observations in the sample). Example: we treat a group of 120 herpes patients with a new drug and 30 get better, so p̂ = 30/120 = 0.25 (the proportion of patients improving in the sample).

Sampling distribution of p̂. The sampling distribution of p̂ is never exactly Normal, but for large enough sample sizes it can be approximated by a Normal curve.

The mean and standard deviation (width) of the sampling distribution are both completely determined by p and n: the mean is p and the standard deviation is √(p(1 − p)/n). Therefore, we won’t need to use a t distribution (unlike with inference for means).
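The claim that p and n fully determine the center and spread can be checked numerically. The sketch below compares the standard formulas (mean p, standard deviation √(p(1 − p)/n)) with the same quantities computed directly from binomial probabilities. The values p = 0.25 and n = 120 are illustrative assumptions, and the function names are hypothetical.

```python
from math import comb, sqrt

def sampling_distribution(p, n):
    """Mean and standard deviation of p-hat from the formulas p and sqrt(p(1-p)/n)."""
    return p, sqrt(p * (1 - p) / n)

def exact_moments(p, n):
    """Same quantities computed from the binomial probabilities of the count X = n * p-hat."""
    probs = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
    mean = sum((x / n) * pr for x, pr in enumerate(probs))
    var = sum((x / n - mean) ** 2 * pr for x, pr in enumerate(probs))
    return mean, sqrt(var)

# Illustrative values only (assumed, not taken from a real population).
formula_mean, formula_sd = sampling_distribution(0.25, 120)
exact_mean, exact_sd = exact_moments(0.25, 120)
print(formula_mean, formula_sd)
print(exact_mean, exact_sd)
```

The two printed pairs should agree up to floating-point error, which is why no separate estimate of spread (and hence no t distribution) is needed.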

Significance test for p. When testing H0: p = p0 (a given value we are testing): if H0 is true, the sampling distribution of p̂ is known. It is approximately Normal with mean p0 and standard deviation √(p0(1 − p0)/n). The test statistic is the standardized value of p̂: z = (p̂ − p0) / √(p0(1 − p0)/n). This is valid when the expected count of successes, np0, and the expected count of failures, n(1 − p0), are each 10 or larger.

P-value for a one-sided or two-sided alternative. The P-value is the probability, if H0 were true, of obtaining a test statistic as extreme as the one computed, or more extreme, in the direction of Ha. For Ha: p > p0 the P-value is P(Z ≥ z); for Ha: p < p0 it is P(Z ≤ z); for Ha: p ≠ p0 it is 2P(Z ≥ |z|). As always, if the P-value is smaller than the chosen significance level α, the effect is statistically significant and we reject H0.
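As a rough illustration of how the direction of Ha changes the P-value, the sketch below computes Normal tail areas with the standard library instead of Table B. The function names and the `alternative` labels are hypothetical, and z = 2.0 is an arbitrary illustrative value.

```python
from math import erfc, sqrt

def normal_cdf(z):
    """P(Z <= z) for a standard Normal variable."""
    return 0.5 * erfc(-z / sqrt(2))

def p_value(z, alternative):
    """P-value for a z statistic; alternative is 'greater', 'less', or 'two-sided'."""
    if alternative == "greater":      # Ha: p > p0
        return normal_cdf(-z)         # upper tail, P(Z >= z)
    if alternative == "less":         # Ha: p < p0
        return normal_cdf(z)          # lower tail, P(Z <= z)
    if alternative == "two-sided":    # Ha: p != p0
        return 2 * normal_cdf(-abs(z))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# Arbitrary illustrative statistic.
for alt in ("greater", "less", "two-sided"):
    print(alt, round(p_value(2.0, alt), 4))
```

The same z gives very different P-values depending on Ha, so the alternative should be fixed before looking at the data.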

Aphids evade predators (ladybugs) by dropping off the leaf. An experiment examined the mechanism of aphid drops. “When dropped upside-down from delicate tweezers, live aphids landed on their ventral side in 95% (sample proportion) of the trials (19 out of 20). In contrast, dead aphids landed on their ventral side in 52.2% of the trials (12 out of 23).” Is there evidence (at significance level 5%) that live aphids land right side up (on their ventral side) more often than chance would predict? Here, “chance” would be 50% ventral landings, so we test H0: p = 0.5 versus Ha: p > 0.5. The expected counts of successes and failures are each 20 × 0.5 = 10, so the z procedure is valid. The test statistic is z = (0.95 − 0.5)/√(0.5 × 0.5/20) ≈ 4.02, and the P-value is P(Z ≥ 4.02). From Table B, P = P(Z < −4.02) < .0002, which is highly significant. We reject H0. There is very strong evidence (P < .0002) that the righting behavior of live aphids is better than chance.
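The aphid calculation can be reproduced in a few lines. This is a minimal sketch using the counts quoted above (19 ventral landings in 20 trials, p0 = 0.5); the helper name is hypothetical, and the tail area is computed with `math.erfc` rather than read from Table B.

```python
from math import erfc, sqrt

def one_proportion_z(successes, n, p0):
    """Standardized value of p-hat under H0: p = p0."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

n, successes, p0 = 20, 19, 0.5

# Condition check: expected successes and failures under H0 must each be >= 10.
expected_successes = n * p0
expected_failures = n * (1 - p0)
conditions_met = expected_successes >= 10 and expected_failures >= 10

z = one_proportion_z(successes, n, p0)
p_upper = 0.5 * erfc(z / sqrt(2))  # one-sided P-value, P(Z >= z)

print(conditions_met, round(z, 2), p_upper)
```

The computed tail area falls well below the .0002 bound obtained from the table, in line with the conclusion above.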

Mendel’s first law of genetic inheritance states that crossing dominant and recessive homozygote parents yields a second generation made of 75% dominant-trait individuals. When Mendel crossed pure breeds of plants producing smooth peas and plants producing wrinkled peas, the second generation (F2) was made of 5474 smooth peas and 1850 wrinkled peas. Do these data provide evidence that the proportion of smooth peas in the F2 population is not 75%? The sample proportion of smooth peas is p̂ = 5474/(5474 + 1850) = 5474/7324 ≈ 0.747. We test H0: p = 0.75 versus Ha: p ≠ 0.75. With p̂ rounded to 0.747, z = (0.747 − 0.75)/√(0.75 × 0.25/7324) ≈ −0.59. From Table B, we find P = 2P(Z < −0.59) = 2 × .2776 = .56, not significant. (Carrying p̂ at full precision gives z ≈ −0.51 and P ≈ .61, which leads to the same conclusion.) We fail to reject H0. The data are consistent with a dominant-recessive genetic model. However, it is important to remember that we cannot “prove” that the null hypothesis is true, only that it is a possibility.
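Because the difference p̂ − p0 is tiny here, the z statistic is sensitive to how p̂ is rounded. The sketch below, using Mendel's counts from the example, computes the two-sided test once with p̂ rounded to three decimals (which reproduces z ≈ −0.59) and once at full precision. The variable and function names are hypothetical.

```python
from math import erfc, sqrt

smooth, wrinkled, p0 = 5474, 1850, 0.75
n = smooth + wrinkled
se = sqrt(p0 * (1 - p0) / n)  # standard deviation of p-hat under H0

def two_sided_p(z):
    """Two-sided P-value, 2 * P(Z >= |z|)."""
    return erfc(abs(z) / sqrt(2))

p_hat = smooth / n
z_rounded = (round(p_hat, 3) - p0) / se  # p-hat rounded to 0.747
z_full = (p_hat - p0) / se               # p-hat at full precision

print(round(z_rounded, 2), round(two_sided_p(z_rounded), 2))
print(round(z_full, 2), round(two_sided_p(z_full), 2))
```

Both versions give a P-value far above 0.05, so the decision not to reject H0 does not depend on the rounding.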

Confidence interval for p. When p is unknown, both the center and the spread of the sampling distribution are unknown, which is a problem: we need to “guess” a value for p. Our options: (1) Use the sample proportion p̂. This is the “large sample method”. It performs poorly when the sample size is small. (2) Use the adjusted estimate p̃ = (X + 2)/(n + 4). This is the “plus four method”. It is reasonably accurate. Always use with caution.

Large-sample confidence interval for p. Level C confidence intervals contain the population proportion p in C% of samples. For an SRS of size n drawn from a large population, with sample proportion p̂ calculated from the data, an approximate level C confidence interval for p is p̂ ± m, where the margin of error is m = z* √(p̂(1 − p̂)/n). Use this method when the number of successes and the number of failures are both at least 15. C is the area under the standard Normal curve between −z* and z*.
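The critical value z* follows from the requirement that the area between −z* and z* equals C. A minimal sketch, assuming Python 3.8+ for `statistics.NormalDist` (the function name `z_star` is hypothetical):

```python
from statistics import NormalDist

def z_star(confidence):
    """Critical value with central area `confidence` under the standard Normal curve."""
    # Central area C leaves (1 - C)/2 in each tail, so z* is the (1 + C)/2 quantile.
    return NormalDist().inv_cdf((1 + confidence) / 2)

for c in (0.90, 0.95, 0.99):
    z = z_star(c)
    central_area = NormalDist().cdf(z) - NormalDist().cdf(-z)
    print(c, round(z, 3), round(central_area, 4))
```

For C = 0.90 this gives the familiar z* ≈ 1.645 used in the examples that follow.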

Medication side effects. Arthritis is a painful, chronic inflammation of the joints. An experiment on the side effects of pain relievers examined arthritis patients to find the proportion of patients who suffer side effects. What are some side effects of ibuprofen? Serious side effects (seek medical attention immediately): allergic reactions (difficulty breathing, swelling, or hives); muscle cramps, numbness, or tingling; ulcers (open sores) in the mouth; rapid weight gain (fluid retention); seizures; black, bloody, or tarry stools; blood in your urine or vomit; decreased hearing or ringing in the ears; jaundice (yellowing of the skin or eyes); abdominal cramping, indigestion, or heartburn. Less serious side effects (discuss with your doctor): dizziness or headache; nausea, gaseousness, diarrhea, or constipation; depression; fatigue or weakness; dry mouth; irregular menstrual periods.

We compute a 90% confidence interval for the population proportion of arthritis patients who suffer some “adverse symptoms.” Out of 442 patients, 23 suffered some adverse symptoms, so p̂ = 23/442 ≈ 0.052. For a 90% confidence level, z* = 1.645. Using the large sample method: m = 1.645 × √(0.052 × 0.948/442) ≈ 1.645 × 0.0106 ≈ 0.017, so the interval is 0.052 ± 0.017, or 0.035 to 0.069. With 90% confidence, between 3.5% and 6.9% of arthritis patients taking this pain medication experience some adverse symptoms.
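The interval above can be checked with a short function. This is a sketch of the large-sample method under the conditions stated earlier (at least 15 successes and 15 failures); the function name and the choice to raise an error when the condition fails are assumptions made for illustration.

```python
from math import sqrt

def large_sample_ci(successes, n, z_star):
    """Approximate confidence interval p-hat +/- z* sqrt(p-hat(1 - p-hat)/n)."""
    if successes < 15 or n - successes < 15:
        raise ValueError("large-sample method needs at least 15 successes and 15 failures")
    p_hat = successes / n
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Arthritis example: 23 of 442 patients with adverse symptoms, 90% confidence.
low, high = large_sample_ci(23, 442, 1.645)
print(round(low, 3), round(high, 3))
```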

“Plus four” confidence interval for p. The “plus four” method gives reasonably accurate confidence intervals. We act as if we had four additional observations, two successes and two failures. Thus, the new sample size is n + 4 and the count of successes is X + 2. The “plus four” estimate of p is p̃ = (X + 2)/(n + 4). An approximate level C confidence interval is p̃ ± m, with m = z* √(p̃(1 − p̃)/(n + 4)). Use this method when C is at least 90% and the sample size is at least 10.

We want a 90% CI for the population proportion of arthritis patients who suffer some “adverse symptoms.” What is the value of the “plus four” estimate of p? p̃ = (23 + 2)/(442 + 4) = 25/446 ≈ 0.056. An approximate 90% confidence interval for p using the “plus four” method is: m = 1.645 × √(0.056 × 0.944/446) ≈ 1.645 × 0.0109 ≈ 0.018, so the interval is 0.056 ± 0.018, or 0.038 to 0.074. With 90% confidence, between 3.8% and 7.4% of the population of arthritis patients taking this pain medication experience some adverse symptoms.
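A sketch of the “plus four” calculation for the same data (23 of 442 patients), adding two successes and two failures before applying the usual formula. The function name is hypothetical, and the sample-size check mirrors the condition stated for this method.

```python
from math import sqrt

def plus_four_ci(successes, n, z_star):
    """Approximate interval based on p-tilde = (X + 2)/(n + 4)."""
    if n < 10:
        raise ValueError("plus four method needs a sample size of at least 10")
    p_tilde = (successes + 2) / (n + 4)
    margin = z_star * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde - margin, p_tilde + margin

# Arthritis example at 90% confidence (z* = 1.645).
low, high = plus_four_ci(23, 442, 1.645)
print(round(low, 3), round(high, 3))
```

With a sample this large, the result differs only slightly from the large-sample interval; the adjustment matters more for small samples.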

Sample size for a desired margin of error. You may need to choose a sample size large enough to achieve a specified margin of error m. Because the sampling distribution of p̂ is a function of the unknown population proportion p, this process requires that you guess a likely value for p, written p*. The required sample size is n = (z*/m)² p*(1 − p*), rounded up. Try this out: the value p*(1 − p*) is greatest when p* = 0.5. Make an educated guess, or use p* = 0.5 (the most conservative estimate).
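One way to “try this out” is to evaluate p*(1 − p*) over a grid of guesses and see where it peaks. The grid of 101 values below is an arbitrary choice for illustration.

```python
# Candidate guesses p* = 0.00, 0.01, ..., 1.00 (arbitrary grid).
candidates = [i / 100 for i in range(101)]

# The guess that maximizes p*(1 - p*), and therefore the required sample size.
best = max(candidates, key=lambda p: p * (1 - p))

print(best, best * (1 - best))
print(0.1 * (1 - 0.1))  # a guess far from 0.5 gives a much smaller product
```

Since the required n is proportional to p*(1 − p*), guessing 0.5 can never underestimate the sample size, which is why it is the conservative choice.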

What sample size would we need in order to achieve a margin of error no more than 0.01 (1 percentage point) with a 90% confidence level? We could use 0.5 for our guessed p*. However, since the drug has been approved for sale over the counter, we can safely assume that no more than 10% of patients should suffer “adverse symptoms” (a better guess than 50%). For a 90% confidence level, z* = 1.645, so n = (1.645/0.01)² × 0.1 × 0.9 ≈ 2435.4. To obtain a margin of error no more than 0.01, we need a sample size n of at least 2436 arthritis patients. Note: for a 0.03 margin of error we would need only 271 arthritis patients. Do the calculations and check your answer.

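Sample size and margin of error, continued. Example: what sample size would we need in order to achieve a margin of error no more than 0.03 (3 percentage points) with a 95% confidence level? For a 95% confidence level, z* = 1.96. Keeping the guess p* = 0.1 from the previous example, n = (1.96/0.03)² × 0.1 × 0.9 ≈ 384.2, so we need at least 385.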
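The sample-size answers in these examples can be reproduced with the formula n = (z*/m)² p*(1 − p*), rounded up. In the sketch below the function name is hypothetical, and the guess p* = 0.1 for the 95% example is an inference: it is the value that reproduces the stated answer of 385, not something the slide states explicitly.

```python
from math import ceil

def sample_size(margin, z_star, p_guess=0.5):
    """Smallest n whose margin of error is at most `margin`, given a guess for p."""
    return ceil((z_star / margin) ** 2 * p_guess * (1 - p_guess))

print(sample_size(0.01, 1.645, 0.1))  # 90% confidence, m = 0.01
print(sample_size(0.03, 1.645, 0.1))  # 90% confidence, m = 0.03
print(sample_size(0.03, 1.96, 0.1))   # 95% confidence, m = 0.03, assumed p* = 0.1
print(sample_size(0.03, 1.96))        # same, with the conservative default p* = 0.5
```

Rounding up rather than to the nearest integer guarantees the margin of error does not exceed the target.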