Published by Martha Daniel. Modified over 6 years ago.
1
Dependent and Independent Variables, and Distributions
Dr. Yang Hu, Dr. Stuart Bedston
2
Session overview
What is a dependent variable (DV)? From outcome of interest to DV.
What is an independent variable (IV)? From explanatory factors/concepts to IV.
Preparing and recoding the variables for modelling.
3
The underlying equation
Example from the introduction session:
Y (worry) = b*X (age) + a (constant)
Y = dependent variable (the question: how much worry?)
X = independent variable/predictor (Y depends on age)
If we have multiple variables:
Y (worry) = b1*X1 (age) + b2*X2 (gender) + b3*X3 (ethnicity) + a (constant)
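As a rough illustration of the single-predictor equation Y = b*X + a, the sketch below estimates b and a by ordinary least squares. The data values and the function name `fit_simple_ols` are made up for illustration and are not taken from the BCS dataset or from the session materials.

```python
# Hypothetical toy data: ages and worry scores (NOT from the BCS dataset).
ages = [20, 30, 40, 50, 60]
worry = [1.0, 1.5, 2.0, 2.5, 3.0]


def fit_simple_ols(x, y):
    """Return (b, a) for the line y = b*x + a fitted by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope: change in Y per one-unit change in X
    a = mean_y - b * mean_x  # constant: predicted Y when X is 0
    return b, a


b, a = fit_simple_ols(ages, worry)
print(f"worry = {b:.3f} * age + {a:.3f}")
```

With several predictors the same idea applies, but the coefficients b1, b2, b3 are estimated jointly, which in practice is left to statistical software.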
4
Dependent and independent variables
5
Outcome of interest?
Dependent variables depend on what the independent variables are and their values.
How do we determine the DV?
Concept-driven approach: devise and find the best possible measurement for a concept in which we are interested.
Data-driven approach: what questions can be answered using available measures?
6
Outcome of interest?
Group work: based on the list of variables in the user manual, choose a DV and devise a research question based on the DV of your choice.
7
Types of variable distribution
In the BCS unrestricted teaching dataset:
Nominal
Scale
Ordinal
Other common possibilities:
Count
Rate
The distribution matters as it determines the choice of model (next session).
8
Nominal categorical variable
Formed of unordered, mutually exclusive categories
Binomial (e.g. sex, but…)
Multinomial (e.g. marital status)
9
Scale/continuous numeric variable
Continuous (or can be conceptualised as such) scalar variable (e.g. age, level of worry about being victim of personal crime)
10
Ordinal variable
Discrete categories that follow an order (e.g. how much has the crime rate in this area changed since two years ago?)
11
Count variable
Every event/occurrence/incidence counts as 1 unit, and the variable summarises the total number of units in a given frame. It is important to define both the unit and the frame (e.g. number of crimes in given countries, regions, etc.).
12
Rate variable
Number of counts per XYZ population. It is important to define both the population baseline and the unit (e.g. murders per million people by country).
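The step from a count to a rate can be sketched as follows. The function name `rate_per` and the figures are hypothetical and serve only to show why both the population baseline and the unit ("per million") have to be stated.

```python
def rate_per(count, population, per=1_000_000):
    """Convert a raw count into a rate per `per` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * per


# Hypothetical figures, for illustration only.
murders = 650
population = 65_000_000

rate = rate_per(murders, population)
print(f"{rate:.1f} murders per million people")
```

The same count yields very different numbers depending on the baseline chosen (per thousand, per million, per household), so the baseline belongs in the variable's definition.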
13
Group activity
In groups, classify the 35 variables in the BCS example dataset by their type of distribution.
14
Distribution matters for both independent and dependent variables
Independent variable: distribution affects the meaning and interpretation of the explanations of interest.
Dependent variable: distribution affects the choice of which type of model/equation is appropriate.
Any type of measure can be a DV in one context and an IV in another; it depends on the research question and the explanatory framework.
DV = usually one at a time
IV = can be one or many
But… an IV needs to be in the model for a reason: conceptual, statistical, or both.
15
Coding/data preparation beyond data cleaning
Prepare variables for statistical modelling:
Extreme values/influential cases
Small categories and statistical power/stability of estimation
Distribution that violates statistical assumptions underlying a given model specification (to be covered in Session 4)
16
Recoding/preparing variables beyond data cleaning
Continuous variable
Example: Age
Worry about being victim of personal crime: Age has a long tail on the right-hand side.
17
Recoding/preparing variables beyond data cleaning
Continuous variable
Example: Age
Worry about being victim of personal crime: Age (original distribution, 16 to 101)
18
Recoding/preparing variables beyond data cleaning
Continuous variable
Example: Age
Worry about being victim of personal crime: Age (delete the top 3% of cases [>=84])
19
Recoding/preparing variables beyond data cleaning
Continuous variable
Example: Age
Worry about being victim of personal crime: Age (set values in the top 3% equal to the 97th percentile [=83])
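The two options for the long right-hand tail of age, deleting the extreme cases or capping them, can be sketched as below. The list of ages is invented for illustration; only the cut-offs 84 and 83 come from the slides.

```python
# Hypothetical ages; only the cut-offs (84 and 83) are taken from the slides.
ages = [16, 25, 40, 67, 83, 84, 90, 101]

TRIM_FROM = 84  # delete cases with age >= 84 (top 3%)
CAP_AT = 83     # 97th percentile of age

# Option 1 (deleting): drop the extreme cases, which shrinks the sample.
trimmed = [age for age in ages if age < TRIM_FROM]

# Option 2 (capping): keep every case but pull extreme values down to the cap.
capped = [min(age, CAP_AT) for age in ages]

print(len(ages), len(trimmed), len(capped))
print(max(trimmed), max(capped))
```

Deleting loses respondents (and their values on every other variable), whereas capping keeps the sample size but changes the meaning of the top value to "83 or older".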
20
Recoding/preparing variables beyond data cleaning
From continuous variable to categorical variable
Example: Age or cohort?
Cohort/generation replacement may be the concept of interest rather than age (life course dynamics/life stage).
Non-linear age pattern?
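A minimal sketch of recoding continuous age into categories, either as age bands (life stage) or as birth cohorts. The band edges, labels, the decade-based cohort definition and the survey year in the usage lines are all assumptions chosen for illustration, not values from the session.

```python
from bisect import bisect_right

# Hypothetical band edges and labels, chosen only for illustration.
BAND_EDGES = [25, 45, 65]  # lower bounds of the 2nd, 3rd and 4th bands
BAND_LABELS = ["16-24", "25-44", "45-64", "65+"]


def age_band(age):
    """Map an age in years to a life-stage band."""
    return BAND_LABELS[bisect_right(BAND_EDGES, age)]


def birth_cohort(age, survey_year):
    """Map age at interview to a decade-of-birth cohort label."""
    birth_year = survey_year - age
    decade = birth_year // 10 * 10
    return f"{decade}s"


# Usage: the survey year here is a placeholder, not the dataset's actual year.
print(age_band(30), birth_cohort(30, survey_year=2007))
```

Within a single survey year, age and cohort carry the same information, so the choice between them is a conceptual one about which explanation is of interest.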
22
Recoding/preparing variables beyond data cleaning
Categorical: marital status
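For a categorical variable, one common preparation step is merging small categories, which relates to the earlier point about small categories and stability of estimation. The sketch below is one possible approach; the category labels, the threshold and the helper name `collapse_small` are hypothetical and do not reflect the dataset's actual coding.

```python
from collections import Counter

# Hypothetical responses; labels are illustrative, not the dataset's codes.
marital = ["married"] * 6 + ["single"] * 4 + ["widowed"] + ["separated"]


def collapse_small(values, min_count, other_label="other"):
    """Replace categories with fewer than `min_count` cases by `other_label`."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


recoded = collapse_small(marital, min_count=2)
print(Counter(recoded))
```

Whether categories should be merged mechanically like this, or combined on substantive grounds, is a judgement that depends on the research question.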
23
Group work
How would you potentially prepare the variables?
One group works on nominal variables
One group works on scale variables
One group works on ordinal variables