Art of modelling DEB course 2013 Bas Kooijman Dept theoretical biology Vrije Universiteit Amsterdam Texel, 2013/04/16.



Modelling 1
model: scientific statement in mathematical language
  “all models are wrong, some are useful”
aims:
  structuring thought; the single most useful property of models
  “a model is not more than you put into it”
  how do factors interact? (mechanisms/consequences)
  design of experiments, interpretation of results
  inter-, extra-polation (prediction)
  decision/management (risk analysis)

Modelling 2
language errors: mathematical, dimensions, conservation laws
properties:
  generic (with respect to application)
  realistic (precision)
  simple (math. analysis, aid in thinking)
  plasticity in parameters (support, testability)
ideals:
  assumptions for mechanisms (coherence, consistency)
  distinction action variables/measured quantities
  core/auxiliary theory

Causation
Cause and effect sequences can work in chains: A → B → C
But they are problematic in networks of A, B and C
The framework of dynamic systems allows for a holistic approach

Dynamic systems
Defined by simultaneous behaviour of input, state variables, output
Supply systems: input + state variables → output
Demand systems: input ← state variables + output
Real systems: mixtures between supply & demand systems
Constraints: mass, energy balance equations
State variables: span a state space
  behaviour: usually a set of ode’s with parameters
Trajectory: map of behaviour of state vars in state space
Parameters: constant, functions of time, functions of modifying variables
  compound parameters: functions of parameters

Empirical cycle

Modelling criteria
Consistency: dimensions, conservation laws, realism (consistency with data)
Coherence: consistency with neighbouring fields of interest, levels of organisation
Efficiency: comparable level of detail, all vars and pars are effective, numerical behaviour
Testability: amount of support, hidden variables

Dimension rules
quantities left and right of = must have equal dimensions
+ and – are only defined for quantities with the same dimension
ratios of variables with similar dimensions are only dimensionless if addition of these variables has a meaning within the model context
never apply transcendental functions (log, exp, sin, …) to quantities with a dimension
  What about pH, and $\mathrm{pH}_1 - \mathrm{pH}_2$?
don’t replace parameters by their values in model representations
  $y(x) = a x + b$, with $a = 0.2\ \mathrm{M}^{-1}$, $b = 5$ → $y(x) = 0.2 x + 5$
  What dimensions have y and x?
Distinguish dimensions and units!

Model without dimension problem
Arrhenius model: $\ln k = a - T_0/T$
  k: some rate; T: absolute temperature; a: parameter; $T_0$: Arrhenius temperature
Alternative form: $k = k_0 \exp\{1 - T_0/T\}$, with $k_0 = \exp\{a - 1\}$
Difference with allometric model: no reference value required to solve dimension problem
Figure: Arrhenius plot, ln rate against $T^{-1}$
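The equivalence of the two Arrhenius forms can be checked numerically. The sketch below is only an illustration: the values of a and T_0 are hypothetical, chosen to give rates of ordinary magnitude, and the function names are not from the course.

```python
import math

def rate_log_form(T, a, T0):
    """Rate k from ln k = a - T0/T."""
    return math.exp(a - T0 / T)

def rate_scaled_form(T, k0, T0):
    """Rate k from k = k0 exp{1 - T0/T}."""
    return k0 * math.exp(1.0 - T0 / T)

# Hypothetical parameter values, for illustration only.
a, T0 = 30.0, 8000.0          # T0 in kelvin
k0 = math.exp(a - 1.0)        # link between the two forms

temperatures = (283.0, 293.0, 303.0)  # absolute temperatures, K
for T in temperatures:
    print(T, rate_log_form(T, a, T0), rate_scaled_form(T, k0, T0))

# In an Arrhenius plot (ln k against 1/T) the slope is -T0.
T1, T2 = 283.0, 303.0
slope = (math.log(rate_log_form(T2, a, T0)) - math.log(rate_log_form(T1, a, T0))) / (1 / T2 - 1 / T1)
print(slope)
```

Both columns agree for every temperature, and the slope of the Arrhenius plot recovers -T_0 without any reference temperature entering the calculation.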

Models with dimension problems
Allometric model: $y = a W^b$
  y: some quantity; a: proportionality constant; W: body weight; b: allometric parameter in (2/3, 1)
  Usual form: $\ln y = \ln a + b \ln W$
  Alternative form: $y = y_0 (W/W_0)^b$, with $y_0 = a W_0^b$
  Alternative model: $y = a L^2 + b L^3$, where $L \propto W^{1/3}$
Freundlich’s model: $C = k c^{1/n}$
  C: density of compound in soil; k: proportionality constant; c: concentration in liquid; n: parameter in (1.4, 5)
  Alternative form: $C = C_0 (c/c_0)^{1/n}$, with $C_0 = k c_0^{1/n}$
  Alternative model: $C = 2 C_0 c (c_0 + c)^{-1}$ (Langmuir’s model)
Problem: no natural reference values $W_0$, $c_0$; values of $y_0$, $C_0$ depend on the arbitrary choice
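A small numerical sketch of the problem with reference values, using hypothetical numbers for a and b: the reference form gives the same predictions as the usual allometric form whatever W_0 is chosen, but the value of y_0 changes with that choice.

```python
import math

def allometric(W, a, b):
    """Usual form y = a W^b."""
    return a * W ** b

def allometric_ref(W, y0, W0, b):
    """Reference form y = y0 (W/W0)^b."""
    return y0 * (W / W0) ** b

# Hypothetical values, for illustration only.
a, b = 3.0, 0.75
W = 250.0

results = {}
for W0 in (1.0, 10.0, 1000.0):      # arbitrary reference weights
    y0 = a * W0 ** b                # y0 depends on the choice of W0
    results[W0] = (y0, allometric_ref(W, y0, W0, b))
    print(W0, y0, results[W0][1])

print(allometric(W, a, b))
```

The prediction for W is identical in every row, while y_0 differs by orders of magnitude between the rows.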

Allometric functions
Figure: O$_2$ consumption, μl/h, against length, mm
Two curves fitted:
  $a L^2 + b L^3$, with a in μl h$^{-1}$ mm$^{-2}$ and b in μl h$^{-1}$ mm$^{-3}$
  $a L^b$, with a in μl h$^{-1}$ mm$^{-b}$ and b = 2.437

Kleiber’s law
O$_2$ consumption ∝ weight$^{3/4}$
O$_2$ consumption has contributions from
  maintenance & development
  overheads of assimilation, growth & reproduction
These are all functions of weight that should be added
But: sum of functions of weight ≠ allometric function of weight
Problem in relating respiration to other activities
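Why a sum of weight-dependent terms is not an allometric function can be seen from the slope on a log-log scale. The sketch below uses the two-term model a L^2 + b L^3 from the earlier slide with hypothetical coefficients; a true allometric function would give a constant slope.

```python
def loglog_slope(L, a, b):
    """d ln y / d ln L for y = a L^2 + b L^3."""
    return (2 * a * L ** 2 + 3 * b * L ** 3) / (a * L ** 2 + b * L ** 3)

# Hypothetical coefficients, for illustration only.
a, b = 1.0, 1.0
lengths = [0.01, 0.1, 1.0, 10.0, 100.0]
slopes = [loglog_slope(L, a, b) for L in lengths]
for L, s in zip(lengths, slopes):
    print(L, round(s, 3))
```

The slope drifts from about 2 for small lengths to about 3 for large lengths, so a single exponent only approximates the sum over a limited size range.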

Egg development time
Bottrell, H. H., Duncan, A., Gliwicz, Z. M., Grygierek, E., Herzig, A., Hillbricht-Ilkowska, A., Kurasawa, H., Larsson, P., Weglenska, T. A review of some problems in zooplankton production studies. Norw. J. Zool. 24:

Space-time scales
Figure: levels of organisation (molecule, cell, individual, population, ecosystem, system earth) on axes of time and space
Each process has its characteristic domain of space-time scales
When changing the space-time scale, new processes will become important, others will become less important
Individuals are special because of straightforward energy/mass balances

Complex models
hardly contribute to insight
hardly allow parameter estimation
hardly allow falsification
Avoid complexity by
  delineating modules
  linking modules in simple ways
  estimating parameters of modules only

Biodegradation of compounds
n-th order model; Monod model
X: conc. of compound; X$_0$: X at time 0; t: time; k: degradation rate; n: order; K: saturation constant
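The equations on this slide did not survive in the transcript. The sketch below therefore assumes the textbook forms, dX/dt = -k X^n for the n-th order model and dX/dt = -k X/(K + X) for the Monod model; these are an assumption, not a reconstruction of the slide, and all parameter values are hypothetical. Note that k has different dimensions in the two models.

```python
import math

def rk4(f, x0, t_end, steps):
    """Integrate dx/dt = f(x) from t = 0 to t_end with classical Runge-Kutta."""
    h = t_end / steps
    x = x0
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

def nth_order(k, n):
    """Assumed n-th order model: dX/dt = -k X^n."""
    return lambda X: -k * max(X, 0.0) ** n

def monod(k, K):
    """Assumed Monod model: dX/dt = -k X / (K + X)."""
    return lambda X: -k * X / (K + X)

# Hypothetical values, for illustration only.
X0, t_end = 10.0, 2.0
X_first = rk4(nth_order(0.5, 1), X0, t_end, 200)   # n = 1: exponential decay
X_monod = rk4(monod(1.0, 2.0), X0, t_end, 200)
print(X_first, X0 * math.exp(-0.5 * t_end))
print(X_monod)
```

For n = 1 the numerical result can be compared with X_0 exp(-k t). The Monod model has no explicit solution for X(t), but it satisfies K ln(X/X_0) + X - X_0 = -k t, which gives an independent check of the integration.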

Biodegradation of compounds
Figure: n-th order model and Monod model, scaled conc. against scaled time

Plasticity in parameters
If plasticity of shapes of $y(x|a)$ is large as function of a:
  few problems in estimating the value of a from $\{x_i, y_i\}_i$ (small confidence intervals)
  little support from data for underlying assumptions (if data were different: other parameter value results, but still a good fit, so no rejection of assumption)

Stochastic vs deterministic models
Only stochastic models can be tested against experimental data
Standard way to extend deterministic model to stochastic one:
  regression model: $y(x|a,b,..) = f(x|a,b,..) + e$, with $e \sim N(0, \sigma^2)$
  Originates from physics, where e stands for measurement error
Problem: deviations from model are frequently not measurement errors
Alternatives:
  deterministic systems with stochastic inputs
  differences in parameter values between individuals
Problem: parameter estimation methods become very complex
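A minimal sketch of the regression construction: a deterministic function plus independent normal scatter. The linear f and all parameter values are hypothetical stand-ins; only the structure y = f + e with e normally distributed comes from the slide.

```python
import math
import random

def f(x, a, b):
    """Hypothetical deterministic model, chosen only for illustration."""
    return a * x + b

def simulate(xs, a, b, sigma, rng):
    """Regression model: y = f(x|a,b) + e, with e ~ N(0, sigma^2)."""
    return [f(x, a, b) + rng.gauss(0.0, sigma) for x in xs]

rng = random.Random(1)                      # fixed seed for repeatability
a_true, b_true, sigma_true = 0.2, 5.0, 0.5  # hypothetical values
xs = [0.01 * i for i in range(5000)]
ys = simulate(xs, a_true, b_true, sigma_true, rng)

# Deviations from the deterministic model recover the scatter parameter.
residuals = [y - f(x, a_true, b_true) for x, y in zip(xs, ys)]
mean_res = sum(residuals) / len(residuals)
sd_hat = math.sqrt(sum(r * r for r in residuals) / len(residuals))
print(mean_res, sd_hat)
```

In this construction all scatter is treated as measurement error that is independent between data points, which is exactly the assumption the slide questions.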

Statistics
Deals with
  estimation of parameter values, and confidence in these values
  tests of hypotheses about parameter values
    does a parameter value differ from a known value?
    do parameter values differ between two samples?
Deals NOT with
  does model 1 fit better than model 2, if model 1 is not a special case of model 2
Statistical methods assume that the model is given
(Non-parametric methods only use some properties of the given model, rather than its full specification)

Large scatter
complicates parameter estimation
complicates falsification
Avoid large scatter by
  standardization of factors that contribute to measurements
  stratified sampling

Kinds of statistics
Descriptive statistics
  sometimes useful, frequently boring
Mathematical statistics
  beautiful mathematical construct
  rarely applicable due to assumptions to keep it simple
Scientific statistics
  still in its childhood due to research workers being specialised
  upcoming thanks to increase of computational power (Monte Carlo studies)

Nested models Venn diagram

Testing of hypothesis
Error of the first kind: reject null hypothesis while it is true
Error of the second kind: accept null hypothesis while the alternative hypothesis is true
Level of significance of a statistical test: α = probability of an error of the first kind
Power of a statistical test: 1 – probability of an error of the second kind

decision    null hypothesis true    null hypothesis false
accept      1 – α                   1 – power
reject      α                       power

No certainty in statistics
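To make level of significance and power concrete, the sketch below computes the power of a one-sided z-test for a mean with known standard deviation. The choice of test, the function name and the numbers are assumptions made for illustration; the slide does not prescribe a particular test.

```python
import math
from statistics import NormalDist

def power_one_sided_z(delta, sigma, n, alpha=0.05):
    """Power of a one-sided z-test of H0: mean = mu0 against mean = mu0 + delta.

    delta: true shift of the mean under the alternative hypothesis
    sigma: known standard deviation of single observations
    n:     sample size
    """
    z = NormalDist()
    critical = z.inv_cdf(1.0 - alpha)          # reject when test statistic exceeds this
    return 1.0 - z.cdf(critical - delta * math.sqrt(n) / sigma)

# Hypothetical numbers, for illustration only.
for n in (5, 20, 80):
    print(n, round(power_one_sided_z(delta=0.5, sigma=1.0, n=n), 3))

# With no true effect the rejection probability equals the level of significance.
print(power_one_sided_z(delta=0.0, sigma=1.0, n=20))
```

The power cannot be evaluated without choosing delta, which is the point made on the next slide: the alternative hypothesis has to be specified, and that requires knowledge of the subject.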

Statements to remember
“proving” something statistically is absurd
if you do not know the power of your test, you don’t know what you are doing while testing
you need to specify the alternative hypothesis to know the power
  this involves knowledge about the subject (biology, chemistry, ..)
parameters only have a meaning if the model is “true”
  this involves knowledge about the subject

Independent observations
If X and Y are independent

Central limit theorems
The sum of n independent identically (i.i.) distributed random variables becomes normally distributed for increasing n.
The sum of n independent point processes tends to behave as a Poisson process for increasing n.
  The number of events in a time interval is i.i. Poisson distributed
  Time intervals between subsequent events are i.i. exponentially distributed
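A simulation sketch of two of these statements: sums of i.i. exponential variables behave approximately normally, and events separated by i.i. exponential intervals give Poisson-like counts (mean equal to variance). It does not simulate the superposition of point processes itself; the rates, sample sizes and seed are arbitrary choices.

```python
import math
import random

def coverage_of_sums(n, reps, rng):
    """Fraction of standardised sums of n exponential(1) variables inside +/- 1.96."""
    inside = 0
    for _ in range(reps):
        s = sum(rng.expovariate(1.0) for _ in range(n))
        z = (s - n) / math.sqrt(n)          # mean n, variance n
        inside += abs(z) < 1.96
    return inside / reps

def poisson_counts(rate, horizon, reps, rng):
    """Number of events in [0, horizon] when intervals are exponential(rate)."""
    counts = []
    for _ in range(reps):
        t, events = rng.expovariate(rate), 0
        while t <= horizon:
            events += 1
            t += rng.expovariate(rate)
        counts.append(events)
    return counts

rng = random.Random(2013)                   # fixed seed for repeatability
cover = coverage_of_sums(50, 4000, rng)
counts = poisson_counts(3.0, 1.0, 20000, rng)
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
print(cover, mean, var)
```

The coverage should be close to the normal value 0.95, and the mean and variance of the counts should both be close to rate times interval length.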

Sums of random variables
Figure panels: exponential prob dens; Poisson prob

Normal probability density

Parameter estimation
Most frequently used method: maximization of (log) likelihood
  likelihood: probability of finding observed data (given the model), considered as function of parameter values
If we repeat the collection of data many times (same conditions, same number of data) the resulting ML estimate
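As a minimal illustration of likelihood maximization, the sketch below estimates the rate of exponentially distributed time intervals. The exponential model, the simulated data and the grid search are choices made for this example only; for this model the ML estimate is also known in closed form (number of intervals divided by their sum), which gives a check on the numerical maximum.

```python
import math
import random

def log_likelihood(rate, data):
    """Log likelihood of i.i. exponential intervals: n ln(rate) - rate * sum(data)."""
    return len(data) * math.log(rate) - rate * sum(data)

rng = random.Random(7)                       # fixed seed for repeatability
data = [rng.expovariate(2.0) for _ in range(200)]   # simulated intervals, true rate 2

# Closed-form ML estimate for the exponential model.
rate_ml = len(data) / sum(data)

# Numerical maximization over a grid of candidate parameter values.
grid = [0.5 + 0.001 * i for i in range(4000)]
rate_grid = max(grid, key=lambda r: log_likelihood(r, data))
print(rate_ml, rate_grid)
```

The data are held fixed and the parameter is varied, which is what "considered as function of parameter values" means in practice.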

Profile likelihood large sample approximation 95% conf interval

Comparison of models
Akaike Information Criterion for sample size n and K parameters, in the case of a regression model
You can compare goodness of fit of different models to the same data, but statistics will not help you to choose between the models
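The formula on this slide is missing from the transcript. A commonly used least-squares form for regression models with normal scatter is AIC = n ln(RSS/n) + 2K, up to an additive constant, where RSS is the residual sum of squares. The sketch below assumes that form, and whether K counts the variance parameter differs between texts; the numbers are hypothetical.

```python
import math

def aic_least_squares(rss, n, K):
    """Assumed least-squares form of AIC: n ln(RSS/n) + 2K (additive constant dropped)."""
    return n * math.log(rss / n) + 2 * K

# Hypothetical fits of two models to the same 20 data points.
n = 20
aic_small = aic_least_squares(rss=10.0, n=n, K=2)
aic_large = aic_least_squares(rss=10.0, n=n, K=3)   # extra parameter, no better fit
print(aic_small, aic_large)
```

An extra parameter that does not reduce the residual sum of squares raises the criterion by 2. Such a number ranks fits to the same data; consistent with the slide, it does not say which model is the better description of the underlying process.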

Confidence intervals
Table (values not preserved): estimate and sd of parameters $L_\infty$, mm and $r_B$, d$^{-1}$, excluding and including point 4
Figure: length, mm against time, d, with 95% conf intervals, excluding and including point 4
correlations among parameter estimates can have big effects on sim conf intervals

No age, but size: Trichopsis vittatus
These gouramis are from the same nest, they have the same age and lived in the same tank
Social interaction during feeding caused the huge size difference
Age-based models for growth are bound to fail; growth depends on food intake

Rules for feeding

Social interaction → Feeding
Figure: reserve density and length against time, for 1 ind and 2 ind, with deterministic curve and expectation

Dependent observations
Conclusion: dependences can work out in complex ways
The two growth curves look like von Bertalanffy curves with very different parameters
But in reality both individuals had the same parameters!