Statistical Issues in Research Planning and Evaluation

Presentation transcript:

Chapter 7: Statistical Issues in Research Planning and Evaluation
Research Methods in Physical Activity

To plan your own study or evaluate a study by someone else, you need to understand these concepts and their interrelationships: probability, alpha, power, sample size, and effect size.
Probability: the odds that a certain event will occur. One concept of probability related to statistics is called equally likely events.
Equally likely events: a concept of probability in which the chances of one event occurring are the same as the chances of another event occurring. For example, if you roll a die, each of the numbers from 1 to 6 is equally likely to occur. Another pertinent approach to probability involves relative frequency.
Relative frequency: a concept of probability concerning the comparative likelihood of two or more events occurring. For example, suppose that you toss a coin 100 times. You would expect heads 50 times and tails 50 times; the probability of either result is one-half, or .50. When you actually toss the coin, however, you may get heads 48 times, or .48. This is the relative frequency. You might perform 100 tosses 10 times and never get exactly .50, but the relative frequencies would be distributed closely around .50, and you would still assume the probability to be .50.
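The coin-toss example can be tried out with a short simulation. This is only a sketch: the function name, the seed, and the use of Python's pseudo-random generator in place of a real coin are assumptions made for illustration.

```python
import random

def relative_frequency_of_heads(n_tosses, rng):
    """Toss a simulated fair coin n_tosses times and return the proportion of heads."""
    heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
    return heads / n_tosses

# Fixed seed so the run is repeatable; the value 7 is arbitrary.
rng = random.Random(7)

# Ten sets of 100 tosses, as in the example above.
frequencies = [relative_frequency_of_heads(100, rng) for _ in range(10)]
mean_frequency = sum(frequencies) / len(frequencies)

print(frequencies)               # individual relative frequencies
print(round(mean_frequency, 3))  # their average
```

Individual sets rarely land on exactly .50, yet the values cluster around it, which is why the probability is still taken to be .50.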

Probability in Statistical Tests and Alpha (Level of Significance)
Probability in statistical tests: In a statistical test, you sample from a population of participants and events. You use probability statements to describe the confidence that you place in the statistical findings. Frequently, you encounter a statistical test followed by a probability statement such as p < .05. The interpretation is that, if chance alone were operating, a difference or relationship of this size would be expected fewer than 5 times in 100.
Alpha (level of significance): In research, the test statistic is compared with a probability table for that statistic, which tells you how often a value that large occurs by chance. In behavioral research, alpha (the probability of a chance occurrence) is frequently set at .05 or .01 (a result this large would occur by chance 5 times in 100 or 1 time in 100). These values are used to control for a type I error.
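Comparing a test statistic with a probability table can be sketched in a few lines. The slide does not name a particular statistic, so the z statistic, the two-sided test, and the function names below are assumptions chosen to keep the example short.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Probability, if the null hypothesis is true, of a z statistic at least this far from 0."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def is_significant(z, alpha=0.05):
    """True when the chance probability falls below the preset alpha."""
    return two_sided_p_value(z) < alpha

# Hypothetical z statistics from three imaginary studies.
for z in (1.50, 2.10, 2.80):
    p = two_sided_p_value(z)
    print(f"z = {z:.2f}  p = {p:.3f}  "
          f"p < .05: {is_significant(z, 0.05)}  p < .01: {is_significant(z, 0.01)}")
```

The same z statistic can be significant at .05 and not at .01, which is why alpha should be fixed before the data are examined.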

Error Types (Type I and Type II Errors)
In a study, the experimenter may make two types of error. A type I error is rejecting the null hypothesis when the null hypothesis is true. For example, a researcher concludes that there is a difference between two methods of training, but there really is not. A type II error is failing to reject the null hypothesis when the null hypothesis is false. For example, a researcher concludes that there is no difference between the two training methods, but there really is a difference. Figure 7.1 (p. 116), called a truth table, displays type I and type II errors.
You control for type I errors by setting alpha. For example, if alpha is set at .05 and 100 experiments are conducted in which the null hypothesis of no difference or no relationship is true, the null hypothesis would be rejected on only about 5 occasions. To some extent the issue is this: If you had to make an error, which type of error would you be willing to make? The level of alpha reflects the type of error that you are willing to make.
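The claim that alpha fixes the type I error rate can be checked by simulation. The sketch below assumes two training groups drawn from the same normal population with a known standard deviation of 1 and a two-sided z test; the group size, the number of experiments, and the seed are arbitrary choices, not values from the chapter.

```python
import random
from statistics import NormalDist, mean

def simulate_type_i_rate(n_experiments=2000, n_per_group=30, alpha=0.05, seed=1):
    """Share of simulated experiments that reject a null hypothesis that is actually true."""
    rng = random.Random(seed)
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the difference between two means when SD = 1 in both groups.
    standard_error = (2 / n_per_group) ** 0.5
    rejections = 0
    for _ in range(n_experiments):
        # Both groups come from the same population, so any difference is chance.
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        z = (mean(group_a) - mean(group_b)) / standard_error
        if abs(z) > z_critical:
            rejections += 1  # a type I error
    return rejections / n_experiments

rate = simulate_type_i_rate()
print(rate)  # should land near alpha
```

Rerunning with `alpha=0.01` should bring the rejection rate down to roughly 1 in 100.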

Acceptable Variations in Reporting Alpha in Research
Even when experimenters set alpha at a specific level (e.g., .05) before the research, they often report the probability of a chance occurrence for the specific effects of the study at the level it occurred (e.g., p = .012). This procedure is appropriate (and recommended), because the researchers are only demonstrating to what degree the level of probability exceeded the specified level.
Beta
Beta is the magnitude of the type II error. Although the magnitude of the type I error is specified by alpha, you may also make a type II error, the magnitude of which is determined by beta (β). Figure 7.2 (p. 118) shows the overlap of the score distributions on the dependent variable for x (the sampling distribution if the null hypothesis is true) and y (the sampling distribution if the null hypothesis is false).

Beta (continued)
By specifying alpha, you indicate that the mean of y (given a certain distribution) must be at a specified distance from the mean of x before the null hypothesis is rejected. But if the mean of y falls anywhere between the mean of x and the specified y, you could be making a type II error (beta); that is, you do not reject the null hypothesis when, in fact, there is a true difference. There is a relationship between alpha and beta: with everything else held constant, as alpha is set increasingly smaller, beta becomes larger.
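The trade-off between alpha and beta can be made concrete with a normal approximation. In this sketch the one-sided two-group z test, the effect size of 0.5, and the 30 participants per group are all assumed values picked for illustration; `shift` is the distance between the means of x and y measured in standard errors.

```python
from statistics import NormalDist

def beta_for_alpha(alpha, effect_size=0.5, n_per_group=30):
    """Type II error rate for a one-sided, two-group z test (normal approximation)."""
    normal = NormalDist()
    # Cutoff on the null distribution (x) beyond which the null is rejected.
    z_critical = normal.inv_cdf(1 - alpha)
    # How far the alternative distribution (y) sits from x, in standard errors.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Beta is the part of y that falls short of the cutoff.
    return normal.cdf(z_critical - shift)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}  beta = {beta_for_alpha(alpha):.3f}")
```

With sample size and effect size held fixed, each tightening of alpha moves the cutoff farther from the mean of x and leaves more of y on the wrong side of it.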

Meaningfulness (Effect Size)
Meaningfulness is the importance or practical significance of an effect or relationship. The meaningfulness of a difference between two means can be estimated by effect size (also called delta).
Effect size (ES): the standardized value that is the difference between the means divided by the standard deviation. (ES vs. sample size is listed in Tables 7.3 and 7.4, p. 119.)
The formula for effect size is: $ES = (M_1 - M_2) / s$
Also written as: Cohen's $d = (M_1 - M_2) / s_{pooled}$, where $s_{pooled} = \sqrt{(s_1^2 + s_2^2) / 2}$
This formula subtracts the mean of the second group ($M_2$) from the mean of the first group ($M_1$) and divides the difference by the standard deviation. That places the difference between the means in the common metric called standard deviation units. Pay attention to the formula: the greater the difference between the group means and the smaller the variance within the groups, the greater the effect size. An ES of 0.2 or less is small, about 0.5 is moderate, and 0.8 or more is large.
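A minimal sketch of the effect size calculation follows, using the pooled standard deviation form. The scores are made up for illustration, and assigning a value to the nearest of the three benchmarks (0.2, 0.5, 0.8) is an assumption, since the slide does not say how to label values that fall in between.

```python
from statistics import mean, stdev

# Benchmarks quoted on the slide; the nearest-benchmark rule is an assumption.
BENCHMARKS = {"small": 0.2, "moderate": 0.5, "large": 0.8}

def cohens_d(group_1, group_2):
    """Difference between the means in pooled standard deviation units."""
    s_pooled = ((stdev(group_1) ** 2 + stdev(group_2) ** 2) / 2) ** 0.5
    return (mean(group_1) - mean(group_2)) / s_pooled

def nearest_benchmark(d):
    """Label an effect size by whichever benchmark it lies closest to."""
    return min(BENCHMARKS, key=lambda name: abs(abs(d) - BENCHMARKS[name]))

# Hypothetical scores for an experimental and a control group.
experimental = [2, 4, 6, 8, 10]
control = [1, 3, 5, 7, 9]

d = cohens_d(experimental, control)
print(round(d, 3), nearest_benchmark(d))
```

Here the means differ by 1 point and both groups have a standard deviation of about 3.16, so the difference is roughly a third of a standard deviation.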

Power
Power is the probability of rejecting the null hypothesis when the null hypothesis is false (e.g., detecting a real difference), that is, the probability of making a correct decision when a real effect exists. Power ranges from 0 to 1. The greater the power is, the more likely you are to detect a real difference or relationship. You can make rejecting the null hypothesis more likely (increase power) by adding participants, but are your results meaningful?
There are important questions to answer: How large a difference is important in theory or practice? How many participants are needed to declare an important difference as significant? Understanding the concept of power can answer these questions. If a researcher can identify the size of an important effect through previous research or even simply estimate an effect size (e.g., 0.5 is a moderate ES) and establish how much power is acceptable (e.g., a common estimate in the behavioral sciences is 0.8), then the size of the sample needed for a study can be estimated.
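The sample size estimate described above can be sketched with a standard normal approximation for comparing two group means. This is an assumed shortcut, not the method behind the chapter's tables; an exact t-test calculation gives slightly larger numbers, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def participants_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-group comparison of means."""
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2)  # cutoff set by alpha
    z_power = normal.inv_cdf(power)          # margin needed for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Small, moderate, and large effect sizes at alpha = .05 and power = .80.
for es in (0.2, 0.5, 0.8):
    print(f"ES = {es}: about {participants_per_group(es)} participants per group")
```

For a moderate ES of 0.5 the approximation gives about 63 per group, close to the 64 that an exact t-test calculation returns, and the required number climbs steeply as the ES shrinks.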

Power (refer to Figures 7.3 and 7.4)
Power is calculated as 1 - beta. Beta is conventionally kept at about four times alpha, reflecting the relative seriousness of type I versus type II errors. Thus, if alpha is .05, then beta is .20, and power is 0.8. You can review the literature to determine the ES. You now have the power (ability to find differences) and the effect size (meaningfulness), along with a preselected alpha level. Based on this information, you can determine how many subjects you will need to recruit per group to achieve the desired outcome (to find differences).
Note how this works: if you have a small ES, more subjects are required to find differences (obtain power), and the effect of the independent variable may have little practical meaning. Also, if you make the alpha level more stringent and the ES is unchanged, you will have to recruit more subjects to find significant results. The sample size is extremely influential on power (see Table 7.1).
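The influence of sample size on power can be tabulated directly. The sketch below uses a normal approximation for a two-sided, two-group test with an assumed moderate ES of 0.5; it is not a reproduction of Table 7.1, and the group sizes are arbitrary.

```python
from statistics import NormalDist

def power_two_groups(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group z test."""
    normal = NormalDist()
    z_critical = normal.inv_cdf(1 - alpha / 2)
    # Distance between the null and alternative distributions in standard errors.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability that the test statistic lands beyond either critical value.
    return (1 - normal.cdf(z_critical - shift)) + normal.cdf(-z_critical - shift)

for n in (10, 20, 40, 64, 100):
    power = power_two_groups(0.5, n)
    print(f"n per group = {n:3d}  power = {power:.2f}  beta = {1 - power:.2f}")
```

Power reaches the conventional 0.8 (beta of .20, four times an alpha of .05) at roughly 64 participants per group under these assumptions.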

Power (continued)
Keep in mind the relationships of alpha, sample size, and ES in planning a study. If you have access to only a small number of participants, then you need to have a really large ES or use a larger alpha, or both. Do not just blindly specify the .05 alpha if detecting a real difference is the main issue. Use a higher one, such as .20 or even .30. This approach is extremely pertinent in pilot studies.

Using Information in the Context of the Study
Context: the interrelationships found in the real-world setting.
Are effect sizes for significant findings large enough to be meaningful when interpreted within the context of the study, or for the application of findings to other related samples, or for planning a related study? Remember that effect sizes are based on the difference between the means (divided by the standard deviation). The larger the effect size, the less the overlap between the distributions of scores in the two groups (control and experimental).
In very small samples a single unusual value can substantially influence the results. Moreover, within- and between-participant variation (the error variance) tends to be large, which causes the error term in tests of significance to be large, resulting in few significant findings. At the other extreme of sample size, tests of significance have little value for very large samples because nearly any difference or relationship is significant.
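The point about very large samples can be illustrated by holding a trivially small effect fixed and letting the sample grow. The sketch assumes the observed standardized difference is exactly 0.05 in every study and uses a two-sided z test; both are simplifications for illustration.

```python
from statistics import NormalDist

def p_value_for_effect(effect_size, n_per_group):
    """Two-sided p value when the observed standardized difference equals effect_size."""
    z = effect_size * (n_per_group / 2) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

tiny_effect = 0.05  # far below the 0.2 benchmark for a small ES
for n in (20, 200, 2000, 20000):
    p = p_value_for_effect(tiny_effect, n)
    print(f"n per group = {n:5d}  p = {p:.4f}  significant at .05: {p < 0.05}")
```

The effect is the same size in every row; only the sample changes, so significance alone says nothing about whether the difference matters.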

Context
Context is what matters with regard to meaningfulness. You must ask yourself, "Within the context of what I do, does an effect of this size matter?" The answer nearly always depends on who you are and what you are doing (and practically never on whether p = .05 or .01). Thus, having a significant (reliable) effect is a necessary, but not sufficient, condition in statistics. To meet the criteria of being both necessary and sufficient, the effect must be significant and meaningful within the context of its use. Said another way,
♦ estimates of significance are driven by sample size,
♦ estimates of meaningfulness are driven by the size of the difference, and
♦ context is driven by how the findings will be used.
----- End of presentation -----