Computing for Research I Spring 2013

Presentation transcript:

Computing for Research I, Spring 2013
R: Random number generation & Simulations
April 7
Presented by: Liqiong Fan

Outline
How to sample from common distributions:
   Uniform distribution
   Binomial distribution
   Normal distribution
   Pre-specified vector
Examples:
   Randomization code generation
   Simulation 1 (explore the relationship between power and effect size)
   Simulation 2 (explore the relationship between power and sample size)

Syntax for random number generation in R
1. Sample from a known distribution: “r” + name of distribution, e.g.,
   runif()    Uniform
   rbinom()   Binomial
   rnorm()    Normal
   …
2. Sample from a vector: sample(), e.g., draw two numbers from {1,2,3,4,5,6} with replacement

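Uniform distribution (continuous) on $[a, b]$
PDF: $f(x) = \frac{1}{b-a}$ for $a \le x \le b$, and $0$ otherwise
Mean: $\frac{a+b}{2}$
Variance: $\frac{(b-a)^2}{12}$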

[R] Uniform distribution runif(n, min=0, max=1) See R code …
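A minimal sketch of runif() in use. The seed, the sample sizes, and the interval (2, 10) are arbitrary choices for illustration and are not taken from the course code.

```r
set.seed(2013)                        # arbitrary seed, only for reproducibility

u <- runif(5)                         # five draws from Uniform(0, 1), the defaults
v <- runif(1000, min = 2, max = 10)   # 1000 draws from Uniform(2, 10)

mean(v)   # should be close to (2 + 10) / 2 = 6
var(v)    # should be close to (10 - 2)^2 / 12, about 5.33
```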

[R] Uniform distribution
Use the UNIFORM distribution to generate the BERNOULLI distribution
Basic idea: [figure: Uniform distribution mapped to Bernoulli distribution]
See R code …
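One common way to turn uniform draws into Bernoulli draws is to threshold them: if U ~ Uniform(0, 1), then the indicator of U < p equals 1 with probability p. The sketch below assumes this is the idea the slide illustrates; p = 0.6 and the sample size are illustrative values.

```r
set.seed(2013)           # arbitrary seed, only for reproducibility

p <- 0.6                 # assumed success probability
u <- runif(1000)         # Uniform(0, 1) draws
x <- as.integer(u < p)   # 1 when u < p (probability p), otherwise 0

mean(x)                  # sample proportion of 1s, should be near p
```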

[R] Binomial distribution
rbinom(n, size, prob)
e.g. generate 10 binomial random numbers from Binom(100, 0.6): n = 10, size = 100, prob = 0.6
   rbinom(10, 100, 0.6)
e.g. generate 100 Bernoulli random numbers with p = 0.6: n = 100, size = 1, prob = 0.6
   rbinom(100, 1, 0.6)
See R code …
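The two calls above can be run as follows. The variable names and the seed are illustrative; only the rbinom() arguments come from the slide.

```r
set.seed(2013)                              # arbitrary seed, only for reproducibility

b    <- rbinom(10, size = 100, prob = 0.6)  # 10 draws from Binom(100, 0.6)
bern <- rbinom(100, size = 1, prob = 0.6)   # 100 Bernoulli(0.6) draws (size = 1)

b             # each value counts successes out of 100 trials
table(bern)   # counts of 0s and 1s
```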

[R] Normal distribution
   rnorm(n, mean, sd)   # random numbers
   dnorm(x, mean, sd)   # density
   pnorm(q, mean, sd)   # cdf, P(X <= q)
   qnorm(p, mean, sd)   # quantile
See R code …

[R] Normal distribution
   dnorm(x, mean, sd)   # density
e.g. plot a standard normal curve
   pnorm(q, mean, sd)   # probability P(X <= q)
e.g. calculate the p-value for a one-sided test with a standardized test statistic
H0: X <= 0   H1: X > 0
Reject H0 if “Z” is very large.
If the one-sided test gives Z = 3.0, what is the p-value?
P-value = P(Z >= z) = 1 - P(Z <= z)
   1 - pnorm(3, 0, 1)
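Both examples on this slide can be sketched in a few lines. The plotting range of -4 to 4 and the axis labels are illustrative choices; the p-value line is the slide's own expression, followed by an equivalent form that uses the lower.tail argument of pnorm().

```r
# Standard normal density curve
z    <- seq(-4, 4, by = 0.01)
dens <- dnorm(z, mean = 0, sd = 1)
plot(z, dens, type = "l", xlab = "z", ylab = "density",
     main = "Standard normal density")

# One-sided p-value for an observed Z of 3.0
p_value <- 1 - pnorm(3, 0, 1)
p_value                                   # roughly 0.00135

# Same quantity without the explicit subtraction
p_upper <- pnorm(3, lower.tail = FALSE)
```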

[R] Normal distribution
   qnorm(p, mean, sd)   # quantile
See R code …
   rnorm(n, mean, sd)   # random numbers
See R code …
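A short sketch of both functions. The 0.975 quantile and the N(120, sd = 7) draw are illustrative choices, not values from the course code.

```r
# Quantile: the value q such that P(Z <= q) = 0.975 for a standard normal
q <- qnorm(0.975, mean = 0, sd = 1)
q                                       # about 1.96

# Random numbers: 1000 draws with mean 120 and standard deviation 7
set.seed(2013)                          # arbitrary seed, only for reproducibility
y <- rnorm(1000, mean = 120, sd = 7)
c(mean(y), sd(y))                       # should be near 120 and 7
```

Note that rnorm() takes the standard deviation, not the variance.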

[R] Another useful command for sampling from a vector: “sample()”
   sample(x, size, replace = FALSE, prob = NULL)
e.g. randomly choose two numbers from {2,4,6,8,10} with/without replacement
   sample(c(2,4,6,8,10), 2, replace = F)
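A sketch of the with and without replacement cases side by side, plus the permutation that sample() returns when size is omitted. Variable names and the seed are illustrative.

```r
set.seed(2013)                             # arbitrary seed, only for reproducibility
x <- c(2, 4, 6, 8, 10)

no_rep   <- sample(x, 2, replace = FALSE)  # two distinct values
with_rep <- sample(x, 2, replace = TRUE)   # the same value may appear twice
perm     <- sample(x)                      # size omitted: a random permutation of x
```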

[R] Another useful command for sampling from a vector: “sample()”
e.g. A question from our THEORY I CLASS: “Draw a histogram of all possible averages of 6 numbers selected from {1,2,7,8,14,20} with replacement”
Answer: A quick way to solve this question is to do a simulation. That is, we repeat the selection of 6 balls with replacement from an urn containing {1,2,7,8,14,20} many times and plot the averages. The R code looks like:
   a <- numeric(10000)   # one slot per simulated average
   for (i in 1:10000) {
     a[i] <- mean(sample(c(1,2,7,8,14,20), 6, replace = TRUE))
   }
   hist(a)

[R] Another useful command for sampling from a vector: “sample()”
e.g. Generate 1000 Bernoulli random numbers with P = 0.6
   sample(x, size, replace = T, prob = )
Answer: let x = (0, 1), size = 1, replace = T/F, prob = (0.4, 0.6), and repeat 1000 times.
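The answer above can be sketched in two ways. The first follows the slide literally (one draw of size 1, repeated 1000 times, here with replicate()). The second is an equivalent single call with size = 1000, which does require replace = TRUE because the vector has only two elements. Variable names and the seed are illustrative.

```r
set.seed(2013)   # arbitrary seed, only for reproducibility

# Literal version: size = 1, repeated 1000 times
bern_loop <- replicate(1000, sample(c(0, 1), size = 1, prob = c(0.4, 0.6)))

# Single call: 1000 draws with replacement
bern <- sample(c(0, 1), size = 1000, replace = TRUE, prob = c(0.4, 0.6))

mean(bern_loop)  # both proportions should be near 0.6
mean(bern)
```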

Example 1: Generate randomization sequence
Goal: randomize 100 patients to TRT A and B
1. Simple randomization (like flipping a coin): Bernoulli distribution
   0 0 1 0 0 1 0 1 0 0 …. 1 0 1 0
   runif(), rbinom(), sample(). See R code …
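A sketch of simple randomization with each of the three functions named on the slide. The 1:1 allocation probability of 0.5 and the variable names are assumptions; the original R code is not shown in the transcript.

```r
set.seed(2013)   # arbitrary seed, only for reproducibility
n <- 100         # number of patients

trt_unif   <- ifelse(runif(n) < 0.5, "A", "B")             # threshold a uniform draw
trt_binom  <- ifelse(rbinom(n, 1, 0.5) == 1, "A", "B")     # Bernoulli(0.5) coin flip
trt_sample <- sample(c("A", "B"), n, replace = TRUE)       # draw a label directly

table(trt_sample)   # group sizes are random, so they need not be 50/50
```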

Example 1: Generate randomization sequence
Goal: randomize 100 patients to TRT A and B
2. Random allocation rule (RAL)
Unlike simple randomization, the number of allocations to each treatment needs to be fixed in advance.
Again, think about the urn model! The urn holds 50 balls for each treatment, and the balls are drawn without replacement.
RAL only guarantees that treatment allocation is balanced at the end of the sequence.
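The urn model maps directly onto sample() without replacement: fill a vector with 50 "A"s and 50 "B"s and shuffle it. This is a minimal sketch with illustrative variable names.

```r
set.seed(2013)                       # arbitrary seed, only for reproducibility

urn <- rep(c("A", "B"), each = 50)   # 50 balls per treatment
ral <- sample(urn)                   # draw all 100 balls without replacement

table(ral)         # exactly 50 A and 50 B over the whole sequence
table(ral[1:20])   # an early segment is not guaranteed to be balanced
```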

Example 1: Generate randomization sequence
Goal: randomize 100 patients to TRT A and B
3. Permuted block randomization
Block size = 4
   AABB BABA BBAA BABA BAAB … BBAA
sample()
Think about a multi-urn model! [figure: urns, labelled 50, 50, 25, …]
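A sketch of permuted block randomization with block size 4: each block is a small urn holding two "A"s and two "B"s, shuffled independently, and 25 blocks cover the 100 patients. The helper name one_block is hypothetical.

```r
set.seed(2013)   # arbitrary seed, only for reproducibility

n_blocks  <- 25                                             # 25 blocks x 4 = 100 patients
one_block <- function() sample(rep(c("A", "B"), each = 2))  # shuffle AABB

blocks <- replicate(n_blocks, one_block())   # 4 x 25 character matrix, one block per column
seq_pb <- as.vector(blocks)                  # column-major order: block 1, then block 2, ...

apply(blocks, 2, paste, collapse = "")       # blocks shown as strings such as "ABBA"
table(seq_pb)                                # 50 A and 50 B overall
```

Because every block is balanced, the allocation is balanced after every fourth patient, not only at the end.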

Example 2: Investigate the relationship between effect size and power (drug increases SBP)
Linear model: Y = b0 + b1X + e
Y: Systolic Blood Pressure (response)
X: intervention (1 = drug vs. 0 = control)
e: random error, with variance equal to var(Y) within a group
When X = 0, E(Y) = b0, the effect of control; when X = 1, E(Y) = b0 + b1, the effect of drug. The between-group difference is b1, so b1 represents the effect size of the new drug relative to the control.
For instance, assuming that the SBP in the control population is distributed as N(120, 49), what is the power if the new drug truly increases SBP by 0, 1, 2, 3, 4, or 5 units in a study with a sample size of 100 (50 on drug, 50 on placebo)?
Important information: Y (placebo) ~ N(120, 49), so b0 = 120 and e ~ N(0, 49)

Example 2: Investigate the relationship between effect size and power (drug increases BP)
Y: Blood Pressure (response)
X: intervention (1 = drug vs. 0 = control)
e: random error, with variance equal to var(Y) within a group
Linear model: Y = b0 + b1X + e, with b0 = 120 and e ~ N(0, 49)
Important information: Y (placebo) ~ N(120, 49)
We try to answer: what is the power when b1 (the real effect size of the treatment) is 0, 1, 2, 3, 4, or 5?
Definition of power: the probability of rejecting the NULL when the ALTERNATIVE IS TRUE (i.e., b1 = some non-zero value). When b1 = 0 the null is true, so the rejection rate estimates the type I error rather than power.
If we run the simulation N times, power is estimated by the proportion of the N simulations in which b1 (the treatment effect) is significant (P < 0.05) in the linear regression test.

Example 2: Investigate the relationship between effect size and power (drug increases BP)
Y: Systolic Blood Pressure (response)
X: intervention (1 = drug vs. 0 = control)
e: random error, with variance equal to var(Y) within a group
Linear model: Y = b0 + b1X + e
Simulation steps (e.g., sample size = 50 per group, 1000 simulations):
1. Generate X according to the study design (50 “1”s and 50 “0”s);
2. Generate 100 values of “e” from N(0, 49);
3. Given b0 and b1, generate Y using Y = b0 + b1X + e;
4. Use the 100 pairs of (Y, X) to fit a new linear model, and get the new b0 and b1 and their p-values;
5. Repeat these steps 1000 times.
If type I error is 0.05, for a two-sided test
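The simulation steps can be sketched as below. This is not the course's original code: the function name sim_power and its arguments are hypothetical, and the sketch assumes that the 49 in N(0, 49) is a variance, so rnorm() is called with sd = 7. A result counts as a rejection when the two-sided p-value for b1 from lm() is below 0.05.

```r
set.seed(2013)   # arbitrary seed, only for reproducibility

# Estimated rejection rate for one effect size b1 (hypothetical helper)
sim_power <- function(b1, n_per_group = 50, n_sim = 1000,
                      b0 = 120, sd_e = 7, alpha = 0.05) {
  x <- rep(c(1, 0), each = n_per_group)                 # step 1: study design
  reject <- logical(n_sim)
  for (i in seq_len(n_sim)) {
    e   <- rnorm(2 * n_per_group, mean = 0, sd = sd_e)  # step 2: random errors
    y   <- b0 + b1 * x + e                              # step 3: responses
    fit <- lm(y ~ x)                                    # step 4: refit the model
    p   <- summary(fit)$coefficients["x", "Pr(>|t|)"]   # two-sided p-value for b1
    reject[i] <- p < alpha
  }
  mean(reject)                                          # step 5: proportion of rejections
}

effects <- 0:5
power_by_effect <- sapply(effects, sim_power)
names(power_by_effect) <- effects
power_by_effect
```

The entry for b1 = 0 estimates the type I error and should sit near 0.05; the other entries estimate power and should increase with b1.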

Example 3: Investigate the relationship between sample size and power
Linear model: Y = b0 + b1X + e
We try to answer: what is the power given b1 = 2 and sample size = 25, 50, 75, 100, 125, and 150 per group?
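A sketch for this question, under assumptions carried over from Example 2 that the slide does not restate: b0 = 120, error variance 49 (so sd = 7), 1000 simulations per scenario, and a two-sided test at 0.05. The function name power_for_n is hypothetical.

```r
set.seed(2013)   # arbitrary seed, only for reproducibility

# Estimated power for one per-group sample size (hypothetical helper)
power_for_n <- function(n_per_group, b1 = 2, b0 = 120, sd_e = 7,
                        n_sim = 1000, alpha = 0.05) {
  x <- rep(c(1, 0), each = n_per_group)
  p_values <- replicate(n_sim, {
    y <- b0 + b1 * x + rnorm(2 * n_per_group, mean = 0, sd = sd_e)
    summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]
  })
  mean(p_values < alpha)   # proportion of simulations that reject b1 = 0
}

sizes <- c(25, 50, 75, 100, 125, 150)
power_by_n <- sapply(sizes, power_for_n)
names(power_by_n) <- sizes
data.frame(n_per_group = sizes, power = power_by_n)
```

Power should rise as the per-group sample size grows, since the standard error of the estimated b1 shrinks.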

Some recommendations
1. Try not to “fix” (hard-code) the parameters in your simulation
2. Always test your code with a small number of iterations before you actually start your simulation
3. Use append / write.table(… append = T …) to save the results or simulated data
4. Print the number of iterations / scenarios. Code:
   print(c)
   flush.console()
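The four recommendations can be combined in one small sketch. Everything named here is illustrative: run_one is a hypothetical helper, the output goes to a temporary file, and the loop counter is called i rather than c. Column names are switched off in write.table() because appending them on every iteration would repeat the header.

```r
# 1. Parameters are function arguments rather than values fixed in the body
run_one <- function(n_per_group, b1, b0 = 120, sd_e = 7) {
  x <- rep(c(1, 0), each = n_per_group)
  y <- b0 + b1 * x + rnorm(2 * n_per_group, mean = 0, sd = sd_e)
  summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]
}

set.seed(2013)                           # arbitrary seed, only for reproducibility
n_iter   <- 20                           # 2. small trial run before the full simulation
out_file <- tempfile(fileext = ".txt")   # hypothetical output location

for (i in seq_len(n_iter)) {
  res <- data.frame(iter = i, p_value = run_one(n_per_group = 50, b1 = 2))
  # 3. append each result as soon as it is available
  write.table(res, file = out_file, append = TRUE,
              row.names = FALSE, col.names = FALSE)
  # 4. show progress immediately
  print(i)
  flush.console()
}

saved <- read.table(out_file, col.names = c("iter", "p_value"))
```

Saving inside the loop means a crash partway through a long run loses only the current iteration.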