SADC Course in Statistics Modelling ideas in general – an appreciation (Session 20)



Objectives
The aim of this session is to give you an appreciation of the approaches available for modelling response variables that are not quantitative measurements.

Contents
A brief overview of modelling ideas in general.
The emphasis is on the different analysis approaches that cater for the different types of response being modelled, and on an appreciation of the standard form of a model: data = pattern + residual.
Presentation of the case study work will then follow.

Steps in Modelling
1. Exploratory stage
2. Comparing competing models
3. Fitting the chosen model
4. Checking model assumptions
5. Interpreting the model
6. Presenting the results
Always aim for the simplest model that still describes all the pattern.

Statistical Models
data = pattern + residual
The pattern is described by known or possible explanatory variables. For example, in the paddy survey data, yield may have a linear relationship with the amount of fertiliser (a continuous variable), and yield may differ from variety to variety (a grouping variable).
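The decomposition data = pattern + residual can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the fertiliser and yield values are made up (they are not the paddy survey data), the pattern is a single straight line in fertiliser fitted by ordinary least squares, and the variety effect is left out to keep the sketch short.

```python
# Hypothetical values for illustration only; NOT the paddy survey data.
fertiliser = [0.0, 1.0, 2.0, 3.0, 4.0]       # explanatory variable (x)
paddy_yield = [20.0, 27.0, 29.0, 36.0, 38.0]  # response (y)

n = len(fertiliser)
mean_x = sum(fertiliser) / n
mean_y = sum(paddy_yield) / n

# Ordinary least squares estimates for a straight line y = intercept + slope * x.
sxx = sum((x - mean_x) ** 2 for x in fertiliser)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(fertiliser, paddy_yield))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# pattern = fitted values; residual = what the pattern leaves unexplained.
pattern = [intercept + slope * x for x in fertiliser]
residual = [y - p for y, p in zip(paddy_yield, pattern)]

for y, p, r in zip(paddy_yield, pattern, residual):
    print(f"data {y:5.1f} = pattern {p:5.1f} + residual {r:5.1f}")
```

Each data value is recovered exactly as its fitted value plus its residual, and with an intercept in the model the least squares residuals sum to zero.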

Statistical Models
data = pattern + residual
The pattern comes from known or possible explanatory variables.
Use the residual component to check model assumptions, e.g. with plots of residuals, or a histogram for quantitative data (e.g. paddy yield).
When the data are quantitative, the residuals are assumed to have a normal distribution with constant variance.

Statistical Models
data = pattern + residual
The pattern comes from known or possible explanatory variables.
The residual component needs further consideration when the data have a hierarchical structure, e.g. plants within pots and leaves within plants, or households and individuals within households.
Different levels of variation require moving to more advanced procedures, such as multilevel modelling, with more than one residual term.

Generalised Linear Models
data = pattern + residual
Not all data are quantitative measurements: often the interest is in proportions (or percentages), or the response may be in the form of counts.
Such data call for more modern methods, i.e. generalised linear models. In these models, the residuals have non-normal distributions.

Logistic/Log-linear Models
data = pattern + residual
Logistic modelling is used to model data that are binary, i.e. have only 2 categories. The quantity being modelled is the log odds of getting response = yes.
Log-linear modelling is suitable when dealing with categorical data having more than two categories.
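To make "log odds" concrete, the sketch below converts between a probability of response = yes and its log odds, and shows how a linear pattern on the log-odds scale maps back to a probability. The function names and the coefficient values are assumptions chosen for illustration; they do not come from any fitted model in the course.

```python
import math


def log_odds(p):
    """Log odds (logit) of a probability p, where 0 < p < 1."""
    return math.log(p / (1.0 - p))


def probability_from_log_odds(eta):
    """Inverse of log_odds: map a value on the log-odds scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients of a linear pattern on the log-odds scale.
intercept = -1.0
slope = 0.5

x = 2.0
eta = intercept + slope * x              # pattern on the log-odds scale
p_yes = probability_from_log_odds(eta)   # implied probability of response = yes

print(f"log odds = {eta:.2f}, P(response = yes) = {p_yes:.2f}")
```

Working on the log-odds scale lets the pattern take any real value while the implied probability always stays between 0 and 1.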

In summary…
The appropriate model depends on the data type of your key response measurement.
With quantitative measurements, use standard regression/anova type models:
– a normal distribution is assumed
– if the data are skewed, consider a transformation
For binary data, use logistic regression models.
For categorical variates (more than two categories), use Poisson regression or log-linear models.

Case Study
Presentation will follow…