Introduction to statistical estimation methods Finse Alpine Research Center, 10-11 September 2010.



OUTLINE

Focus of the course:
- Introduce essential methods for statistical modelling in ecology
  - Construction of biologically sound models
  - Estimation of parameter values and associated uncertainties
  - Interpretation of results
- Introduce concepts that are important for the course next week

TODAY: Mostly maximum likelihood
TOMORROW: Mostly Bayesian statistics
SUNDAY: Day off / Glacier hike
MONDAY TO FRIDAY: Occupancy modelling workshop (3 new lecturers, joining on the glacier hike)

- Some lectures
- Many exercises
- Tutoring
- Help each other
- Ask questions
- Be active!

MATERIAL ON:

Most studies in ecology require quantification in some way.

Quantification of ...
- ... relationships between variables
- ... differences between groups of individuals
- ... the effect of experimental treatments
- ... predictions for the future (effects of climate change)
- ... the effect of management strategies
and not least: quantification of uncertainty!

Quantification of anything requires:
- ... some sort of model
- ... ways to estimate parameters / distributions of random variables

Claim: In ecology, the main question is seldom IF something has an effect. The questions are more about HOW and HOW MUCH.

Example: How does habitat quality affect energy expenditure?

[Diagram relating Energy expenditure (Field metabolic rate) to: Body mass, HABITAT, Season, Sex, Reproductive state, Temperature, Weather, Activity / Behaviour, and OTHER THINGS (biological things + measurement error)]

- The question should not be IF these variables have an effect: from biological theory we can be almost certain that all these variables have an effect.
- Relationships in ecology are almost infinitely complex (there is no true model).
- “All models are wrong, but some are useful” (Box)

[Diagram: Energy expenditure (Field metabolic rate) and the variables that may affect it, as on the previous slide]

“Typical approach”:
1. Put everything into a linear model (multiple regression)
2. Remove non-significant effects
3. Report p-values

- Without thinking about HOW the various predictor variables can affect the response variable
- Without thinking about what you are really interested in
- Without quantifying HOW MUCH the predictor variables affect the response variable, and without thinking about BIOLOGICAL SIGNIFICANCE

Statistical significance vs. biological relevance

[Figure: 5 different confidence intervals along an effect size axis (Small to Large), labelled p<0.001, NS, p<0.05, NS, p<0.05, and classified as: Biologically significant / Not biologically significant / Could be important (more data needed)]

Null-hypothesis tests are often used erroneously to make a classification of “no effect” (not significant) and “significant effect” with no consideration of the potential biological significance (a somewhat thoughtless process). E.g. statements like “Predator density did not affect prey survival” with no further detail on effect size.
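The distinction between statistical and biological significance can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `interpret_interval` and the threshold `relevant_effect` (the smallest effect judged biologically important) are hypothetical and not part of the slides, and the threshold has to come from biological knowledge, not from the data.

```python
def interpret_interval(lower, upper, relevant_effect):
    """Classify a confidence interval for an effect size.

    lower, upper    -- confidence limits of the estimated effect
    relevant_effect -- hypothetical smallest effect (absolute value)
                       considered biologically important
    """
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    if relevant_effect <= 0:
        raise ValueError("relevant_effect must be positive")

    # Statistical significance: the interval excludes zero.
    label = "significant" if (lower > 0 or upper < 0) else "NS"

    # Biological relevance: compare the whole interval with the threshold.
    if lower >= relevant_effect or upper <= -relevant_effect:
        biology = "biologically significant"
    elif -relevant_effect < lower and upper < relevant_effect:
        biology = "not biologically significant"
    else:
        biology = "could be important, more data needed"
    return label, biology


# Five made-up intervals, with a made-up relevance threshold of 1.0.
for ci in [(0.1, 0.3), (-0.2, 0.4), (1.5, 3.0), (-0.5, 2.5), (0.2, 2.0)]:
    print(ci, interpret_interval(ci[0], ci[1], relevant_effect=1.0))
```

Note that a "significant" interval can be biologically negligible, and an "NS" interval can still be compatible with a large effect, which is why the effect size and its uncertainty should be reported.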

[Figure: Number of papers questioning the utility of null hypothesis testing in scientific research (Anderson et al.)]

Null-hypothesis testing in ecological science:
- Yoccoz, N. G. Use, overuse, and misuse of significance tests in evolutionary biology and ecology. Bulletin of the Ecological Society of America 72:
- Anderson, D. R., K. P. Burnham, and W. L. Thompson. Null hypothesis testing: Problems, prevalence, and an alternative. Journal of Wildlife Management 64:
- Web-page:

AIC = deviance + 2 × no. parameters + small sample correction

In a set of models, the model with the lowest AIC will, on average, be the model with the lowest Kullback-Leibler (K-L) distance (i.e., give predictions closest to the truth).

Using p-values for model selection is a different thing.

[Figure: Bias² and Variance (uncertainty) against Number of parameters (K); frequency distributions of predictions relative to the truth for a too simple model and a too complex model]
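As a rough sketch of how this criterion can be computed and used to rank models, the Python below assumes that the deviance is -2 times the maximised log-likelihood and that the small sample correction is the commonly used term 2K(K+1)/(n-K-1); the slide does not spell out either. The function names and the deviances in the example are made up for illustration.

```python
def aicc(deviance, n_params, n_obs):
    """AIC with small sample correction.

    deviance -- assumed here to be -2 * maximised log-likelihood
    n_params -- number of estimated parameters (K)
    n_obs    -- sample size (n)
    """
    if n_obs - n_params - 1 <= 0:
        raise ValueError("need n_obs > n_params + 1 for the correction term")
    correction = 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    return deviance + 2.0 * n_params + correction


def rank_models(models, n_obs):
    """Sort models by AICc, lowest (best supported) first.

    models -- dict mapping model name to (deviance, n_params)
    Returns a list of (name, aicc, difference from the best model).
    """
    scores = {name: aicc(dev, k, n_obs) for name, (dev, k) in models.items()}
    best = min(scores.values())
    return sorted(
        ((name, score, score - best) for name, score in scores.items()),
        key=lambda row: row[1],
    )


# Made-up deviances: extra parameters always lower the deviance,
# but the penalty decides whether the improvement is worth it.
candidates = {"simple": (120.0, 2), "medium": (110.0, 4), "complex": (108.0, 8)}
for name, score, delta in rank_models(candidates, n_obs=30):
    print(f"{name:8s} AICc = {score:7.2f}  delta = {delta:5.2f}")
```

In this made-up example the intermediate model ranks first: the most complex model has the lowest deviance but pays more in penalty than it gains in fit.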

Example: Influence of testosterone on size of home-range in voles.

[Figure: 60 sites … with testosterone treated males and control males]

- Do testosterone treated males have larger home-ranges at high densities?
- What are the effects at low densities?

Response variable: Home range size measured by radio-telemetry
Predictor variables: Treatment, Density, Body mass

Think about HOW things are related

Response variable: home range size

Full model:
  Term                     F Value   Pr(F)
  Body mass                15.79     <0.001
  Treatment
  Density                            <0.001
  Body mass × Treatment
  Density × Treatment

Step 1:
  Term                     F Value   Pr(F)
  Body mass                          <0.001
  Treatment
  Density                            <0.001
  Density × Treatment

Step 2:
  Term                     F Value   Pr(F)
  Body mass                          <0.001
  Treatment
  Density                            <0.001

Step 3:
  Term                     F Value   Pr(F)
  Body mass                          <0.001
  Density                            <0.001

Conclusion: There is no significant effect of ‘Treatment’ (p = 0.32) or a ‘Density × Treatment’ interaction (p = 0.11).

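[Figure: Home range size against Density and against Body mass (M), each shown on the original scale and on the log-log scale]

Home range size (y) vs. Density (D):
  Original scale                  Log scale
  D < c: y = constant             D < c: log(y) = constant
  D ≥ c: y = c*(D - c)^b          D ≥ c: log(y) = log(c) + b*log(D - c)

Home range size (y) vs. Body mass (M):
  Original scale                  Log scale
  y = a*M^b                       log(y) = log(a) + b*log(M)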
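Taking logs turns the power law y = a*M^b into a straight line with slope b and intercept log(a), so the two parameters can be estimated with an ordinary straight-line fit on the log-log scale. The Python sketch below shows one way to do this by least squares. It assumes multiplicative (log-normal) error, and the function name and the data are hypothetical, not taken from the slides.

```python
import math


def fit_power_law(masses, responses):
    """Estimate a and b in y = a * M**b by least squares on the log-log scale."""
    if len(masses) != len(responses) or len(masses) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    log_m = [math.log(m) for m in masses]      # values must be positive
    log_y = [math.log(y) for y in responses]
    n = len(log_m)
    mean_m = sum(log_m) / n
    mean_y = sum(log_y) / n
    sxx = sum((x - mean_m) ** 2 for x in log_m)
    if sxx == 0:
        raise ValueError("masses must not all be equal")
    sxy = sum((x - mean_m) * (y - mean_y) for x, y in zip(log_m, log_y))
    b = sxy / sxx                  # slope on the log-log scale
    log_a = mean_y - b * mean_m    # intercept on the log-log scale
    return math.exp(log_a), b


# Made-up, noise-free data generated from y = 3 * M**0.75.
body_mass = [1.0, 2.0, 4.0, 8.0]
home_range = [3.0 * m ** 0.75 for m in body_mass]
a_hat, b_hat = fit_power_law(body_mass, home_range)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

The intercept is back-transformed with `exp` because the fitted line estimates log(a), not a itself.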

Candidate models

[Figure: panels of log(Home range size) against log(Population density) for the candidate models: Treatment + Treatment × Density; Treatment + Density; Treatment × Density]