

Outline
Some field sampling issues
Overview of approach to understanding a system
Example 1 – KPBS
Example 2 – Xoo
Example 3 – SIR model
Through approach again with these three examples
Karen A. Garrett, Kansas State University

Rust fungi
Phytophthora infestans, an oomycete
Wheat curl mite, vector of Wheat streak mosaic virus
Cercospora apii infects both humans and… celery

Designed experiments vs. observational experiments
Designed experiments generally have a more straightforward analysis.
Observational experiments rely more on correlation, so that interpreting causality may be more difficult.
Many experiments in disease ecology have some designed elements and some observational elements.

Ratio of Phaeosphaeria nodorum to Mycosphaerella graminicola compared to sulfur dioxide emissions (Bearchell et al., PNAS)

Defining an inference space
The inference space of an experiment is the group to which the experimental conclusions can be correctly applied.
The pool from which the experimental units are randomly drawn will clearly be part of the inference space.
Logic outside statistical inference may be used to extend results to a broader set of units.
Definition of this space allows definition of the appropriate experimental unit.

Pseudoreplication
Pseudoreplication occurs when repeated observations of a subject are substituted for replicated applications of a treatment on different subjects.
In general, if it seems that the number of replicates can be increased indefinitely by splitting samples into increasingly smaller units, these are probably pseudoreplicates.
“What is objectionable is when the tentative conclusions derived from unreplicated treatments are given an unmerited veneer of rigor by the erroneous application of inferential statistics” (Hurlbert)

Classic example of pseudoreplication
Figure labels: pseudoreplicates; true replicates
Scenario in which an individual mite is the appropriate experimental unit

Pseudoreplication, example 2: pseudoreplication in spatial samples
Suppose that a treatment has been applied at the larger scale.
Figure label: pseudoreplication

Suppose there is no treatment application or experimental design?
Defining pseudoreplication in an observational study is more challenging.
The variance associated with sampling at different spatial scales, or across different types of groups of individuals, can be compared to determine which are the largest sources of variation.
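One way to compare variation across sampling scales is a variance-components estimate. The sketch below assumes a balanced design with two nested scales (plots, and plants within plots) and uses the method-of-moments estimator from a one-way random-effects analysis of variance. The function name and the severity values are hypothetical, chosen only for illustration.

```python
from statistics import mean

def variance_components(groups):
    """Method-of-moments variance components for a balanced one-way layout.

    groups: list of equal-length lists, one list of subsamples per larger unit.
    Returns (variance among larger units, variance among subsamples within units).
    """
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes a balanced design")
    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]
    # Mean square among larger units and mean square within units.
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (k * (n - 1))
    # A negative estimate is truncated at zero.
    var_between = max(0.0, (ms_between - ms_within) / n)
    return var_between, ms_within

# Hypothetical disease severity (%) for 3 plots with 4 plants sampled per plot.
plots = [[10, 12, 11, 13], [20, 22, 19, 21], [30, 29, 31, 32]]
var_between, var_within = variance_components(plots)
print(f"among plots: {var_between:.2f}, within plots: {var_within:.2f}")
```

In this made-up data set most of the variation lies among plots, which would suggest that adding plots is more informative than adding plants within a plot.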

Important note about correlation and newer statistical packages
Recall the standard assumption for typical analyses of variance that observations are independent.
In the past, and sometimes in the present, people might disregard the possibility of using packages like SAS Proc GLM because they knew their samples were not truly independent.
Newer programs like SAS Proc Mixed (and programs in R?) make it easier to specify more complicated correlation matrices for the errors in an analysis of variance.

Statistical power
Statistical power: the probability of detecting treatment effects that really exist.
Scientists have tended to emphasize controlling the Type I error rate (the probability of designating an effect as “significant” when it is not real) rather than maximizing power.
This seems to be based on the idea that journals should not be cluttered with reports of a lot of effects that are not real.
However, if you want to manage a disease, discarding an effect because the associated p-value is greater than 0.05 may lead you to leave out important effects.
Real effects may be difficult to detect because of noise.
Sensitivity analyses can be used to explore the implications of removing an effect when it is actually real.

Parsimony
On the other hand, parsimony is a good general goal.
Statistical models need to strike a balance to avoid leaving out important predictors and also to avoid overparameterizing.
Mechanistic models can be applied to explore the potential impacts of many predictors.

Statistical power
Power is increased by reducing measurement errors and by increasing sample size.
Just because a null hypothesis has not been rejected doesn’t mean that there are no treatment effects.
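The dependence of power on sample size and measurement error can be illustrated with a short calculation. This is a minimal sketch assuming a two-sided, two-sample z-test with a known common standard deviation and equal group sizes. The function name, effect size, and standard deviations are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sample_z(delta, sigma, n_per_group, alpha=0.05):
    """Power of a two-sided two-sample z-test (known sigma, equal group sizes).

    delta: true difference between treatment means
    sigma: standard deviation of a single observation
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    se = sigma * sqrt(2 / n_per_group)  # standard error of the difference
    shift = delta / se
    # Probability that the test statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Hypothetical scenario: true difference of 5 severity units.
for n in (5, 10, 20, 40, 80):
    noisy = power_two_sample_z(delta=5, sigma=10, n_per_group=n)
    precise = power_two_sample_z(delta=5, sigma=5, n_per_group=n)
    print(f"n={n:3d}  power (sigma=10): {noisy:.2f}  power (sigma=5): {precise:.2f}")
```

When the true difference is zero the function returns alpha, the Type I error rate, which is a quick check that the calculation behaves as expected.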

Testing for bioequivalence
Bioequivalence tests can be used to formally test whether there is no difference between the effects of treatments (within some tolerance).
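A common way to frame an equivalence test is the two one-sided tests (TOST) procedure; the slides do not say which procedure was used, so this is an assumption. The sketch below uses a normal approximation for the estimated difference, and all names and numbers are hypothetical.

```python
from statistics import NormalDist

def tost_equivalence(mean_diff, se_diff, tolerance, alpha=0.05):
    """Two one-sided tests for equivalence within +/- tolerance.

    mean_diff: estimated difference between treatment means
    se_diff:   standard error of that difference
    Returns (p_value, equivalent), using a normal approximation.
    """
    z = NormalDist()
    # H0: true difference <= -tolerance; rejected when the estimate is well above it.
    p_lower = 1 - z.cdf((mean_diff + tolerance) / se_diff)
    # H0: true difference >= +tolerance; rejected when the estimate is well below it.
    p_upper = z.cdf((mean_diff - tolerance) / se_diff)
    # Equivalence is declared only if both one-sided nulls are rejected.
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha

# Hypothetical estimate: difference of 0.5 units with standard error 1.0.
print(tost_equivalence(mean_diff=0.5, se_diff=1.0, tolerance=5.0))
print(tost_equivalence(mean_diff=0.5, se_diff=1.0, tolerance=1.0))
```

The same estimate supports equivalence under a wide tolerance and fails to under a narrow one, so the choice of tolerance carries much of the weight of the conclusion.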

Garrett 1997

Defining a “biological tolerance level”
A sensitivity analysis might be used to define a tolerance level for effects, below which no important impact is expected.
This gives a formal discrimination between statistical significance and biological significance.
But you would need to have a great deal of confidence in your model to rely on this for management decisions.

Meta-analysis applications in plant pathology
Comparisons across studies can be formalized in meta-analyses.
We have illustrated the application of meta-analysis to the large quantities of data available from plant pathology field trials.
Rosenberg, Garrett, Su, and Bowden 2004 Phytopathology
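As a rough illustration of how results from several trials can be combined, the sketch below computes a fixed-effect, inverse-variance weighted summary. This is a generic textbook approach and not necessarily the method used in the cited paper. The function name and trial values are hypothetical.

```python
from math import sqrt

def fixed_effect_summary(effects, variances):
    """Inverse-variance weighted mean effect and its standard error.

    effects:   one effect size estimate per trial
    variances: sampling variance of each estimate
    """
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = sqrt(1 / sum(weights))
    return pooled, se

# Hypothetical effect sizes (treated minus control) from four field trials.
effects = [-0.8, -0.3, -0.5, -1.1]
variances = [0.10, 0.05, 0.20, 0.40]
pooled, se = fixed_effect_summary(effects, variances)
print(f"pooled effect: {pooled:.2f} (SE {se:.2f})")
```

More precise trials receive larger weights. A random-effects model would be the usual alternative when the true effect is expected to vary among trials.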

Metadata
The National Center for Ecological Analysis and Synthesis works with metadata and metadata standards as one of its many projects: web/resources/metadata.html

For discussing the disease data set analyses
Here are some suggestions for pondering your data sets and projects.
You might consider addressing these questions in your discussions and final presentation.

Defining the goals of the project
A. What is the motivation for the project?
  – Understanding the system better: in what way in particular?
  – Learning to manipulate the disease: what are the potential methods for manipulation?
B. What are the hypotheses to be tested or parameters to be estimated?
  – Will the project be sufficient to test hypotheses?
  – Or will it more appropriately generate hypotheses to be tested in a more controlled context?

Variables and parameters
What are the potential predictor and response variables?
What are the parameters to be estimated?
When using parameter estimates from an experiment in a mechanistic simulation model, the estimates might be viewed as values to emphasize while considering a wider range of possible values.

Studying the distribution of variables
It may be necessary to split variables into logical groups, such as by environment.
For example, if environment has a large effect, analyzing the disease severity for samples from all environments in the same analysis might produce a multimodal distribution.

What are sources of bias?
Since samples may not have been collected specifically to answer later questions, estimates may be biased for some questions.
For example, rather than random sampling, specific individuals may have been sampled because of their observed characteristics (symptoms, family size, …).
True random sampling is often a challenge, anyway.

Deciding what to average prior to analysis
Once the appropriate experimental unit is identified, you might average the subsamples within a unit.
Possibly the subsample variance is interesting in its own right and you would like to include it in analyses.
You can keep all the individual subsample measures in the analysis if you are careful to use the correct error estimates for testing effects.
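The effect of averaging subsamples can be seen by comparing a test statistic computed on unit means with one computed as though every subsample were an independent replicate. The sketch below assumes the plot is the experimental unit and uses a Welch t statistic. The data and function name are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def t_statistic(a, b):
    """Welch t statistic for two lists of observations treated as independent."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

# Hypothetical severity scores: 3 plots per treatment, 4 leaves scored per plot.
control = [[10, 11, 9, 10], [14, 15, 13, 14], [12, 12, 11, 13]]
treated = [[13, 14, 12, 13], [17, 16, 18, 17], [12, 13, 11, 12]]

# Correct unit of analysis: one averaged value per plot (n = 3 per treatment).
control_means = [mean(plot) for plot in control]
treated_means = [mean(plot) for plot in treated]
t_units = t_statistic(control_means, treated_means)

# Pseudoreplicated analysis: every leaf counted as a replicate (n = 12).
control_leaves = [x for plot in control for x in plot]
treated_leaves = [x for plot in treated for x in plot]
t_pseudo = t_statistic(control_leaves, treated_leaves)

print(f"t on plot means: {t_units:.2f}, t on individual leaves: {t_pseudo:.2f}")
```

With these made-up numbers the leaf-level analysis gives a noticeably larger statistic in absolute value, because it understates the standard error of the treatment difference.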

Is there a widely accepted model for this system already available?
Can your data be used to further validate this model, or perhaps as an example of a case in which the model does not hold?
Do your data add a new component to this model, such as considering the effects of a novel environmental parameter?

If there are not already accepted models for your system…
Is there a related system that has been studied more, modeled, and might be used as a starting point for considering your system?
For example, SIR models might be generally applied for many types of disease.
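For readers unfamiliar with SIR models, the sketch below simulates the basic susceptible, infected, removed structure with a simple Euler time step. It assumes frequency-dependent transmission and a closed population. The parameter values and function name are hypothetical.

```python
def simulate_sir(beta, gamma, s0, i0, r0=0.0, dt=0.1, steps=1000):
    """Euler integration of a basic SIR model in a closed population.

    beta:  transmission rate per unit time
    gamma: removal (recovery) rate per unit time
    Returns a list of (S, I, R) tuples, one per time step.
    """
    s, i, r = s0, i0, r0
    n = s + i + r
    trajectory = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt
        new_removals = gamma * i * dt
        s, i, r = s - new_infections, i + new_infections - new_removals, r + new_removals
        trajectory.append((s, i, r))
    return trajectory

# Hypothetical parameters: 990 susceptible and 10 infected hosts at the start.
trajectory = simulate_sir(beta=0.5, gamma=0.1, s0=990.0, i0=10.0)
peak_infected = max(i for _, i, _ in trajectory)
final_s, final_i, final_r = trajectory[-1]
print(f"peak infected: {peak_infected:.0f}, never infected: {final_s:.0f}")
```

A fixed-step Euler scheme is used only to keep the example short; a smaller time step or a standard ODE solver would be preferable for real analyses.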

Iteration between input from experimental analyses and input from modeling
Modeling: construction of new hypotheses and predictions
Empirical experimentation: testing of hypotheses in experiments; generation of new parameter estimates; generation of new hypotheses

Sensitivity analysis
Analysis of model output for a range of parameter and variable inputs, that is, analysis of the sensitivity of outputs to changes in the inputs.
The distribution of outputs for a particular set of inputs can be evaluated in terms not only of the mean or median, but also the maxima and minima.
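A small sensitivity analysis can be sketched by running a model over a grid of inputs and summarizing the outputs. The example assumes a logistic disease progress curve as the model, which is an illustrative choice and not one named in the slides. The input ranges and function name are hypothetical.

```python
from itertools import product
from math import exp
from statistics import mean, median

def final_severity(y0, rate, days):
    """Logistic disease progress: proportion of tissue diseased after `days`.

    y0:   initial proportion diseased (0 < y0 < 1)
    rate: apparent infection rate per day
    """
    return 1 / (1 + (1 - y0) / y0 * exp(-rate * days))

# Hypothetical ranges for the two inputs.
y0_values = [0.001, 0.005, 0.01]
rate_values = [0.05, 0.10, 0.15, 0.20]

# Evaluate the model for every combination of inputs.
outputs = [
    final_severity(y0, rate, days=60)
    for y0, rate in product(y0_values, rate_values)
]

summary = {
    "min": min(outputs),
    "median": median(outputs),
    "mean": mean(outputs),
    "max": max(outputs),
}
for name, value in summary.items():
    print(f"{name:>6}: {value:.3f}")
```

Reporting the minimum and maximum alongside the mean and median shows how far the extremes lie from the typical outcome across the input ranges considered.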

Model validation
A data set might be split by location, so that a model developed based on one subset of locations is validated using another subset.
A data set might be split by time, so that a model developed based on earlier time points is validated using later time points.
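Splitting by time can be sketched as follows: fit a model to the earlier observations, then measure its prediction error on the later ones. The example assumes a simple straight-line model fitted by least squares. The data, cutoff, and function names are hypothetical.

```python
from math import sqrt
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    slope = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sum(
        (a - x_bar) ** 2 for a in x
    )
    return slope, y_bar - slope * x_bar

def rmse(x, y, slope, intercept):
    """Root mean squared prediction error of the fitted line."""
    return sqrt(mean((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y)))

# Hypothetical weekly severity observations.
days = [0, 7, 14, 21, 28, 35, 42, 49]
severity = [1.0, 2.1, 2.9, 4.2, 5.0, 6.1, 6.8, 8.1]

cutoff = 5  # the first five time points are used for model development
slope, intercept = fit_line(days[:cutoff], severity[:cutoff])

train_rmse = rmse(days[:cutoff], severity[:cutoff], slope, intercept)
test_rmse = rmse(days[cutoff:], severity[cutoff:], slope, intercept)
print(f"development RMSE: {train_rmse:.2f}, validation RMSE: {test_rmse:.2f}")
```

A split by location would follow the same pattern, with the subsets defined by site instead of by date.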