How statistical methods are used in agricultural experiments. Spring conference of Svenska statistikfrämjandet (the Swedish Statistical Association), 23 March 2012, Alnarp. Johannes Forkman, Fältforsk (Field Research Unit), SLU.

Presentation transcript:


Agricultural field experiments. Experimental treatments: varieties, weed control treatments, plant protection treatments, tillage methods, fertilizers.

Experimental design. Allocate treatments A and B to eight plots... Option 1: AAAABBBB. Option 2: ABABABAB. Option 3: ABBAABBA.

Systematic error. The plots differ... The treatments are not compared on equal terms. There will be a systematic error in the comparison of A and B.

Randomise the treatments. This procedure transforms the systematic error into a random error. (R. A. Fisher)
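The randomisation step can be sketched in a few lines of Python. This is only an illustration: the function name `randomise`, the seed and the layout of eight plots in a row are assumptions, not part of the talk.

```python
import random

def randomise(treatments, replicates, seed=None):
    """Return a random allocation of treatments to plots (hypothetical helper)."""
    rng = random.Random(seed)  # a seed makes the plan reproducible
    plan = [t for t in treatments for _ in range(replicates)]
    rng.shuffle(plan)          # every arrangement is equally likely
    return plan

# Treatments A and B, four plots each, laid out on eight plots in a row.
plan = randomise(["A", "B"], 4, seed=2012)
print("".join(plan))
```

Each run with a different seed gives one of the equally likely arrangements of four A plots and four B plots.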

Example
Treatment   Yield (kg/ha)   Mean (kg/ha)
A           8165
A           7792
A           8397
A
B           8483
B           8602
B           8641
B
The difference is 598 kg/ha.

Randomisation test. The observed difference is 598 kg/ha. There are 8!/(4! 4!) = 70 possible random arrangements. The two most extreme differences are 598 and -598. P-value = 2/70 = 0.029.
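A randomisation test of this kind can be sketched by enumerating all 70 arrangements. The yields below are made up for illustration (they are not the data from the talk), and the variable and function names are assumptions.

```python
from itertools import combinations

# Hypothetical yields (kg/ha) for eight plots; NOT the data from the talk.
yields = [8100, 7800, 8400, 8000, 8500, 8600, 8650, 8900]
# In the observed arrangement, plots 0-3 received A and plots 4-7 received B.
observed_b = (4, 5, 6, 7)

def mean_difference(b_plots):
    """Mean of the plots labelled B minus mean of the plots labelled A."""
    b = [yields[i] for i in b_plots]
    a = [yields[i] for i in range(len(yields)) if i not in b_plots]
    return sum(b) / len(b) - sum(a) / len(a)

observed = mean_difference(observed_b)

# All ways of choosing which four of the eight plots receive B.
arrangements = list(combinations(range(8), 4))

# Two-sided test: count arrangements at least as extreme as the observed one.
n_extreme = sum(
    abs(mean_difference(c)) >= abs(observed) - 1e-9 for c in arrangements
)
p_value = n_extreme / len(arrangements)
print(len(arrangements), n_extreme, round(p_value, 3))
```

In these made-up data the four B plots have the four highest yields, so only the observed arrangement and its mirror image are this extreme, which gives 2/70.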

t-test. Compare with a t-distribution with 6 degrees of freedom. P-value = 0.011.
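For reference, the usual pooled two-sample t statistic is sketched below. The notation (sample means, sample variances and sample sizes for A and B) is an assumption and does not come from the slides; with four plots per treatment the degrees of freedom are 4 + 4 - 2 = 6.

```latex
t = \frac{\bar{y}_B - \bar{y}_A}{s_p \sqrt{\dfrac{1}{n_A} + \dfrac{1}{n_B}}},
\qquad
s_p^2 = \frac{(n_A - 1)\, s_A^2 + (n_B - 1)\, s_B^2}{n_A + n_B - 2},
\qquad
n_A = n_B = 4 .
```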

The randomisation model

The approximate model

A crucial assumption. Unit-treatment additivity: variances and covariances do not depend on treatment.

Heterogeneity A B

Inference about what? Randomisation model (sample): the average if the treatment was given to all plots of the experiment. Approximate model (population): the average if the treatment was given to infinitely many plots?

Variance in a difference

Independent errors. Randomisation gives approximately independent error terms. Information about plot position was ignored. This information can be utilized. Layout: BABAABBA.

Tobler’s law of geography: “Everything is related to everything else, but near things are more related than distant things.” (Waldo Tobler)

Random fields. The random function Z(s) is a stochastic process if the plots belong to a space in one dimension, and a random field if the plots belong to a space in two or more dimensions.

Spatial modelling. Can improve precision. Still rare in analysis of agricultural field experiments. There are many possible spatial models and methods. Can be used whether or not the treatments were randomized... Which is the best design for spatial analysis?
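As one illustration of "near things are more related than distant things", the sketch below builds an exponential correlation matrix for eight plots in a row. The exponential model and the range value are assumptions chosen for the example; the talk does not single out any particular spatial model.

```python
import math

def exponential_correlation(positions, corr_range):
    """Correlation exp(-distance / corr_range) between all pairs of plots.

    The exponential form is only one of many possible spatial models.
    """
    return [
        [math.exp(-abs(s - t) / corr_range) for t in positions]
        for s in positions
    ]

positions = list(range(8))  # plot centres along a row, in plot widths
corr = exponential_correlation(positions, corr_range=2.0)  # hypothetical range

# Correlation between plot 0 and each other plot decays with distance.
print([round(c, 3) for c in corr[0]])
```

A matrix of this kind could serve as the error correlation structure in a spatial analysis, in place of the independence assumption.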

Randomised block design

Incomplete block design. Strata: replicates, blocks, plots.

Incomplete blocks

Split-plot design. Strata: replicates, plots, subplots. (Figure: layout of treatments A-D in Replicate I and Replicate II.)

Comparison. (Figure: layouts of treatments A-D in Replicate I and Replicate II, labelled 1a, 1b, 2a, 2b and 3.)

A design with several strata. Each replicate: (figure labels: sown conventionally, sown with no tillage; cultivar 1, cultivar 2, cultivar 3; Mo applied). Bailey, R. A. (2008). Design of comparative experiments. Cambridge University Press.

The linear mixed model: $y = X\beta + Zu + e$. X: design matrix for fixed effects (treatments). Z: design matrix for random effects (strata). $u$ is $N(0, G)$; $e$ is $N(0, R)$.
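Under this model, and assuming additionally that u and e are independent and that X has full column rank (assumptions not stated on the slide), the standard expressions for the variance of y, the generalised least squares estimator of the fixed effects (BLUE) and the predictor of the random effects (BLUP) can be sketched as:

```latex
\operatorname{Var}(y) = V = Z G Z^{\top} + R,
\qquad
\hat{\beta} = \left( X^{\top} V^{-1} X \right)^{-1} X^{\top} V^{-1} y,
\qquad
\tilde{u} = G Z^{\top} V^{-1} \left( y - X \hat{\beta} \right).
```

In practice G and R are unknown and are replaced by estimates, for example from REML.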

Bates about error strata: “Those who long ago took courses in "analysis of variance" or "experimental design" that concentrated on designs for agricultural experiments would have learned methods for estimating variance components based on observed and expected mean squares and methods of testing based on "error strata". (If you weren't forced to learn this, consider yourself lucky.) It is therefore natural to expect that the F statistics created from an lmer model (and also those created by SAS PROC MIXED) are based on error strata but that is not the case.”

Approximate t and F-tests. The number of degrees of freedom is an issue. SAS: the Satterthwaite or the Kenward & Roger method. A t statistic is used when L is one-dimensional, and an F statistic otherwise.
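A sketch of the usual Wald-type statistics behind such approximate tests, for a hypothesis about the linear combination L of the fixed effects. The notation is an assumption: the estimated variance matrix of y is written with a hat, and X is taken to have full column rank (software typically uses a generalised inverse otherwise).

```latex
t = \frac{L \hat{\beta}}{\sqrt{L \left( X^{\top} \hat{V}^{-1} X \right)^{-1} L^{\top}}}
\quad \text{(one-dimensional } L\text{)},
\qquad
F = \frac{ (L \hat{\beta})^{\top} \left[ L \left( X^{\top} \hat{V}^{-1} X \right)^{-1} L^{\top} \right]^{-1} (L \hat{\beta}) }{ \operatorname{rank}(L) }
\quad \text{(otherwise)}.
```

The Satterthwaite and Kenward & Roger methods concern the denominator degrees of freedom with which these statistics are compared.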

Likelihood ratio test. Full model (FM): p parameters. Reduced model (RM): q parameters. The statistic $-2 \log(L_{RM} / L_{FM})$ is asymptotically $\chi^2$ with $p - q$ degrees of freedom.
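A minimal sketch of the likelihood ratio test using only the Python standard library. The log-likelihood values and parameter counts are hypothetical, and `chi2_sf` is a helper written for this example (it uses the standard recurrence for the chi-square upper tail probability with integer degrees of freedom).

```python
import math

def chi2_sf(x, df):
    """Upper tail probability P(X > x) for a chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)            # exact tail for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))  # exact tail for df = 1
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

# Hypothetical maximised log-likelihoods and parameter counts.
loglik_full, p = -100.0, 5
loglik_reduced, q = -103.0, 3

lr_statistic = 2.0 * (loglik_full - loglik_reduced)  # equals -2 log(L_RM / L_FM)
p_value = chi2_sf(lr_statistic, p - q)
print(round(lr_statistic, 3), round(p_value, 4))
```

The chi-square reference distribution is asymptotic, so the p-value is an approximation in small experiments.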

Bayesian analysis: $y = X\beta + Zu + e$, where $u$ is $N(0, G)$ and $e$ is $N(0, R)$, $G$ is $\mathrm{diag}(\Phi)$ and $R$ is $\mathrm{diag}(\sigma^2)$. Independent prior distributions: $p(\beta)$, $p(\Phi)$. Sampling from the posterior distribution: $p(\beta, \Phi \mid y)$.

P-values in agricultural research. Only discuss statistically significant results. Do not discuss biologically insignificant results (although they are statistically significant). “Limit statements about significance to those which have a direct bearing on the aims of the research.” (Onofri et al., Weed Science, 2009)

Shrinkage estimators. Galwey (2006). Introduction to mixed modelling. Wiley.

Fixed or random varieties? Fixed varieties (BLUE): few varieties, estimation of differences. Random varieties (BLUP): many varieties, ranking of varieties.
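The difference between BLUE and BLUP can be illustrated for the simplest balanced case, where every variety has the same number of replicates. In that case the BLUP of a variety mean shrinks the observed mean towards the overall mean. The variety means and variance components below are hypothetical and treated as known; in practice the variance components would be estimated, for example by REML.

```python
def shrink_means(variety_means, var_variety, var_error, n_reps):
    """Shrink observed variety means towards the overall mean (balanced data).

    Hypothetical helper: the weight is var_variety / (var_variety + var_error / n_reps).
    """
    grand_mean = sum(variety_means.values()) / len(variety_means)
    weight = var_variety / (var_variety + var_error / n_reps)
    return {
        name: grand_mean + weight * (mean - grand_mean)
        for name, mean in variety_means.items()
    }

# Hypothetical observed means (kg/ha), i.e. the BLUEs with fixed varieties.
observed_means = {"V1": 8000.0, "V2": 8400.0, "V3": 8800.0}

blups = shrink_means(observed_means, var_variety=30000.0, var_error=40000.0, n_reps=4)
print(blups)
```

With balanced data the ranking of the varieties is unchanged by shrinkage, but the predicted differences between varieties are smaller than the observed ones.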

Conclusions based on a simulation study: (i) Modelling treatment as random is efficient for small block experiments. (ii) A model with normally distributed random effects performs well, even if the effects are not normally distributed. (iii) Bayesian methods can be recommended for inference about treatment differences.

Summary. Fisher’s ideas about randomisation and blocking are still predominant. Strong focus on p-values. Linear mixed models are used extensively. Spatial and Bayesian methods are used less often. The question is what is random and fixed, and how to calculate p-values.

Thank you for your attention!