Dependent and Independent Variables, and Distributions


Dependent and Independent Variables, and Distributions
Dr. Yang Hu (yang.hu@Lancaster.ac.uk)
Dr. Stuart Bedston (s.bedston@lancaster.ac.uk)

Session overview
- What is a dependent variable (DV)? From outcome of interest to DV.
- What is an independent variable (IV)? From explanatory factors/concepts to IV.
- Preparing and recoding the variables for modelling.

The underlying equation
Example from the introduction session:
Y (worry) = b*X (age) + a (constant)
- Y = dependent variable (question: what level of worry, i.e. how much?)
- X = independent variable/predictor (Y depends on age)
If we have multiple independent variables:
Y (worry) = b1*X1 (age) + b2*X2 (gender) + b3*X3 (ethnicity) + a (constant)
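To make the single-predictor equation concrete, here is a minimal sketch that estimates b and a by ordinary least squares. The ages and worry scores are invented toy values, not taken from the BCS dataset, and `fit_line` is a hypothetical helper name.

```python
# Hypothetical toy data: age (X) and a worry score (Y). Not from the BCS dataset.
ages = [20, 30, 40, 50, 60]
worry = [1.0, 1.5, 2.0, 2.5, 3.0]


def fit_line(x, y):
    """Ordinary least squares for Y = b*X + a with a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope b: covariation of X and Y divided by the variation of X.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    # Constant a: the value of Y when X is zero.
    a = mean_y - b * mean_x
    return b, a


b, a = fit_line(ages, worry)
# Predicted worry for a 45-year-old under the fitted equation.
predicted_worry_at_45 = b * 45 + a
```

With several predictors the same idea applies, but b1, b2, b3 are estimated jointly, which is usually left to a statistics package.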

Dependent and independent variables

Outcome of interest?
Dependent variables depend on what the independent variables are and their values.
How do we determine the DV?
- Concept-driven approach: devise and find the best possible measurement for a concept in which we are interested.
- Data-driven approach: what questions can be answered using the available measures?

Outcome of interest? Group work
Based on the list of variables in the user manual, choose a DV and devise a research question around it.

Types of variable distribution
In the BCS unrestricted teaching dataset:
- Nominal
- Scale
- Ordinal
Other common possibilities:
- Count
- Rate
The distribution matters as it determines the choice of model (next session).

Nominal categorical variable
Formed of unordered, mutually exclusive categories.
- Binomial (e.g. sex, but…)
- Multinomial (e.g. marital status)

Scale/continuous numeric variable
A continuous scalar variable, or one that can be conceptualised as such (e.g. age, level of worry about being a victim of personal crime).

Ordinal variable
Discrete categories that follow an order (e.g. how much the crime rate in this area has changed compared with two years ago).

Count variable
Every event/occurrence/incidence counts as 1 unit, and the variable summarises the total number of units in a given frame. It is important to define both the unit and the frame (e.g. number of crimes in given countries, regions, etc.).

Rate variable
Number of counts per XYZ population. It is important to define the population baseline and the unit (e.g. murders per million people by country).
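The difference between a count and a rate can be shown with a short sketch. The figures below are invented for illustration and are not real crime statistics; `rate_per` is a hypothetical helper.

```python
# Hypothetical figures for illustration only; not real crime statistics.
records = [
    {"country": "A", "murders": 120, "population": 60_000_000},
    {"country": "B", "murders": 45, "population": 9_000_000},
]


def rate_per(count, population, per=1_000_000):
    """Events per `per` people: the count scaled by the population baseline."""
    return count / population * per


# Count variable: the raw number of events in the frame (here, one country).
counts = {r["country"]: r["murders"] for r in records}

# Rate variable: the same events expressed per million people.
rates = {r["country"]: rate_per(r["murders"], r["population"]) for r in records}
```

In this toy example country A has the higher count, but country B has the higher rate once the population baseline is taken into account.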

Group activity
In groups, classify the 35 variables in the BCS example dataset by their type of distribution.

Distribution matters for both independent and dependent variables
- Independent variable: distribution affects the meaning and interpretation of the explanations of interest.
- Dependent variable: distribution affects the choice of which type of model/equation is appropriate.
- Any type of measure can be either a DV or an IV in different contexts; it depends on the research question and the explanatory framework.
- DV: usually one at a time. IV: can be one or many.
- But each IV needs to be in the model for a reason: conceptual, statistical, or both.

Coding/data preparation beyond data cleaning
Prepare variables for statistical modelling. Things to look out for:
- Extreme values/influential cases
- Small categories and statistical power/stability of estimation
- Distributions that violate the statistical assumptions underlying a given model specification (to be covered in Session 4)

Recoding/preparing variables beyond data cleaning
Continuous variable. Example: age → worry about being a victim of personal crime. Age has a long tail on the right-hand side.

Recoding/preparing variables beyond data cleaning
Continuous variable. Example: age → worry about being a victim of personal crime. Age, original distribution (16 to 101).

Recoding/preparing variables beyond data cleaning
Continuous variable. Example: age → worry about being a victim of personal crime. Age, with the top 3% of cases deleted (age >= 84).

Recoding/preparing variables beyond data cleaning
Continuous variable. Example: age → worry about being a victim of personal crime. Age, with the top 3% of cases set equal to the 97th percentile (= 83).
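The two options for handling the long right tail, deleting the top cases or capping them, can be sketched as follows. The list of ages is made up for illustration; only the cut-off of 83 comes from the slide, and in practice it would be computed from the data.

```python
# Hypothetical ages for illustration; 83 is the 97th percentile quoted on the slide.
ages = [16, 25, 40, 62, 83, 84, 90, 101]
CAP = 83

# Option 1, trimming: drop the cases above the cut-off (age >= 84).
trimmed = [age for age in ages if age <= CAP]

# Option 2, capping: keep every case but replace values above the cut-off with it.
capped = [min(age, CAP) for age in ages]
```

Trimming reduces the sample size, while capping keeps every respondent but compresses the oldest ages into a single value; which is preferable depends on the research question.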

Recoding/preparing variables beyond data cleaning
From continuous variable to categorical variable. Example: age or cohort?
- Cohort/generation replacement may be the concept of interest rather than age (life course dynamics/life stage).
- Is there a non-linear age pattern?
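One way to recode a continuous age into categories is sketched below. The band boundaries, the decade-based cohort definition and the `SURVEY_YEAR` value are all assumptions made for illustration, not choices taken from the slides or the dataset.

```python
SURVEY_YEAR = 2007  # assumption for illustration only; use the actual survey year


def age_band(age):
    """Assign an age in years to a hypothetical life-stage band."""
    if age < 30:
        return "16-29"
    if age < 45:
        return "30-44"
    if age < 65:
        return "45-64"
    return "65+"


def birth_cohort(age, survey_year=SURVEY_YEAR):
    """Approximate birth decade from age at interview."""
    birth_year = survey_year - age
    return f"{birth_year // 10 * 10}s"


respondent_ages = [27, 44, 70]
bands = [age_band(a) for a in respondent_ages]
cohorts = [birth_cohort(a) for a in respondent_ages]
```

Age bands speak to life stage, while birth cohorts speak to generational replacement; in a single cross-sectional survey the two are derived from the same number, so the choice between them is conceptual.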


Recoding/preparing variables beyond data cleaning Categorical: marital status
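For a categorical variable such as marital status, one common preparation step is to merge very small categories so that estimates are more stable. The sketch below is a guess at what such a recode might look like: the category labels, the counts and the `min_count` threshold are all hypothetical and do not reflect the dataset's actual coding.

```python
from collections import Counter

# Hypothetical responses; labels and counts are illustrative only.
marital = (
    ["married"] * 50
    + ["single"] * 30
    + ["divorced"] * 12
    + ["widowed"] * 5
    + ["separated"] * 3
)


def collapse_small(values, min_count, other_label="other"):
    """Merge every category with fewer than `min_count` cases into one label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


collapsed = collapse_small(marital, min_count=10)
collapsed_counts = Counter(collapsed)
```

Whether the merged categories belong together is a conceptual decision as much as a statistical one, so a purely count-based rule like this is only a starting point.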

Group work
How would you prepare the variables?
- One group works on nominal variables
- One group works on scale variables
- One group works on ordinal variables