Planning how to create the variables you need from the variables you have
Jane E. Miller, PhD
The Chicago Guide to Writing about Numbers, 2nd Edition


1 Planning how to create the variables you need from the variables you have
Jane E. Miller, PhD
The Chicago Guide to Writing about Numbers, 2nd Edition

2 Overview
Why researchers sometimes need to create new variables to conduct their analysis
Why it is important to plan ahead for how to create those new variables
What information is required to identify the new variables needed for the research question
How to write clear instructions on how to get from the variables you have to the variables you need

3 Why create new variables?
For many statistical analyses, variables available on the original data set are not yet in the form needed to address the research question of interest.
Examples:
– You want to study total family income, but the data set has separate variables measuring income components such as earned income, government benefits, and alimony.
– You want to compare outcomes for age groups (children, working age adults, and the elderly), but the data set reports respondent’s age in single years.

4 Conceptualizing the new variable should precede programming it
It is important to separate
– Researching and planning how those variables should be defined
– Programming the new variable in an electronic database
Each of those tasks
– Has its own challenging aspects
– Uses different skills and resources

5 Some common patterns of creating new from existing variables
A categorical version of a continuous variable
A simplified (collapsed) categorical variable
A binary indicator from a continuous variable
A new continuous variable that combines 2+ continuous variables
A mathematical transformation of a continuous variable

6 A categorical version of a continuous variable
Original variable
– Age in years (continuous)
Needed variable
– Age group (categorical)
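
The age example above can be sketched in a few lines. The cutoffs used here (under 18, 18 to 64, 65 and older) and the function name `age_group` are assumptions for illustration only; the slides do not state which ages define each group, and the right thresholds depend on the research question and the literature.

```python
from typing import Optional


def age_group(age_years: Optional[float]) -> Optional[str]:
    """Classify age in single years into three broad age groups.

    The cutoffs are assumed for illustration; they are not from the slides.
    """
    if age_years is None:
        return None  # a missing age stays missing in the new variable
    if age_years < 0:
        raise ValueError("age cannot be negative")
    if age_years < 18:
        return "child"
    if age_years < 65:
        return "working-age adult"
    return "elderly"


print([age_group(a) for a in (5, 18, 64, 65, None)])
```

Every possible non-negative age falls into exactly one group, which is the property the planning step is meant to guarantee before any programming starts.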

7 A simplified (collapsed) categorical variable
Original variable
– Ten-category ethnicity variable
Needed variable
– Three-category ethnicity variable
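
One way to plan a collapsed variable is to write the mapping as an explicit lookup table and then check that no original category was left out. The sketch below uses purely hypothetical numeric codes (1 to 10 collapsing into 1 to 3); the slides do not list the actual categories, so the grouping shown is a placeholder, not a recommended classification.

```python
# Hypothetical codes: original categories 1-10, collapsed categories 1-3.
COLLAPSE_MAP = {
    1: 1, 2: 1, 3: 1,
    4: 2, 5: 2, 6: 2, 7: 2,
    8: 3, 9: 3, 10: 3,
}

ORIGINAL_CODES = range(1, 11)


def unmapped_codes(mapping, original_codes):
    """Return the original codes that have no entry in the mapping."""
    return [code for code in original_codes if code not in mapping]


def collapse(code):
    """Map an original code to its collapsed code; missing stays missing."""
    if code is None:
        return None
    return COLLAPSE_MAP[code]


print(unmapped_codes(COLLAPSE_MAP, ORIGINAL_CODES))
print(collapse(5))
```

Running the completeness check before creating the variable catches categories that were overlooked in the plan.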

8 A binary indicator from a continuous variable
Original variable
– Birth weight in grams (continuous)
Needed variable
– Indicator of low birth weight status (yes or no)
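
A minimal sketch of this indicator follows. The slide does not give a threshold; the code assumes the widely used definition of low birth weight as under 2,500 grams, and the names `LBW_CUTOFF_GRAMS` and `low_birth_weight` are hypothetical.

```python
from typing import Optional

# Assumed cutoff (under 2,500 grams); the slide does not state a threshold.
LBW_CUTOFF_GRAMS = 2500


def low_birth_weight(birth_weight_grams: Optional[float]) -> Optional[int]:
    """Return 1 for low birth weight, 0 otherwise, None if weight is missing."""
    if birth_weight_grams is None:
        return None
    return 1 if birth_weight_grams < LBW_CUTOFF_GRAMS else 0


print([low_birth_weight(w) for w in (2499, 2500, 3400, None)])
```

Whether the cutoff value itself counts as "low" is exactly the kind of detail the planning stage should settle from the literature.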

9 A new continuous variable that aggregates 2+ continuous variables
Original variable(s)                                    New variable
Separate measures of income for each family member      Total family income
Multiple attitudinal items                              A composite attitudinal scale

10 A new continuous variable calculated from 2+ continuous variables
Original variable(s)                                              New variable
Separate measures of county-level population and poverty rate     Number of poor persons in the county = population × % poor
Separate measures of weight (kg) and height (meters)              Body Mass Index = weight / height²
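
The two formulas above can be written out as code to make the units explicit. This is a sketch under stated assumptions: the poverty rate is taken to be stored as a percentage (for example 12.5 for 12.5%), so it is divided by 100, and the function names are hypothetical.

```python
def number_poor(population: float, pct_poor: float) -> float:
    """Number of poor persons = population x poverty rate.

    Assumes pct_poor is a percentage (0-100), hence the division by 100.
    """
    return population * pct_poor / 100


def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Body Mass Index = weight in kilograms / (height in meters) squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


print(number_poor(200_000, 12.5))
print(round(body_mass_index(70, 1.75), 1))
```

If the data set stored the poverty rate as a proportion, or height in centimeters, the formulas would need a different conversion, which is why the units of each original variable belong in the plan.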

11 A mathematical transformation of a continuous variable
Original variable(s)     New variable
Income in dollars        Logged income
Income in dollars        Income in thousands of dollars
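
Both transformations are one-line formulas, but the log raises a planning question the slide leaves open: what to do with zero or negative incomes, for which a logarithm is undefined. The sketch below assumes a natural log and treats such cases as missing; both choices, and the function names, are illustrative assumptions only.

```python
import math
from typing import Optional


def income_in_thousands(income_dollars: Optional[float]) -> Optional[float]:
    """Rescale income from dollars to thousands of dollars."""
    if income_dollars is None:
        return None
    return income_dollars / 1000


def logged_income(income_dollars: Optional[float]) -> Optional[float]:
    """Natural log of income in dollars.

    Assumption: zero or negative income is returned as missing (None),
    because the logarithm is undefined for those values.
    """
    if income_dollars is None or income_dollars <= 0:
        return None
    return math.log(income_dollars)


print(income_in_thousands(52_500))
print(logged_income(52_500))
```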

12 Planning steps for creating new variables
Finding relevant variables in the original data set
Becoming acquainted with the units and categories for available variables
Consulting the published literature on the topic to see how those concepts have been measured or classified by other researchers
Identifying pertinent formulas and thresholds
Writing out the logic or math needed to create the new variables from existing variables

13 Steps toward creating a new variable
1. Identify the name(s) of the original variable(s) in the data set that contain the data needed to create the new variable.
2. For the new variable, devise
– A name (acronym) to convey
  Content (meaning) of the new variable
  The dates or survey rounds when the data were collected, if pertinent
– A label (short descriptive phrase) for the new variable
  Mention units, if pertinent

14 For new continuous variables
Write the formula to calculate the value of the new variable from the original variables.
Specify the units of the original variable(s) and the new variable.

15 Example: Calculating course grades from component test scores
For a hypothetical college course, the overall course grade is based on three exam scores
– Two mid-term exams (EXAM1 and EXAM2)
  Each scored from 0 to 25 points
– A final exam (FINAL)
  Scored from 0 to 50 points
For each student, the instructor wants to calculate
– The percentage of questions s/he got correct on exam 1
– Total numeric course grade
– Course letter grade, based on standard grade cutoffs

16 Calculating percentage of exam questions correct from number of questions correct
Logic: From the information in the data set, how does one calculate the percentage of questions correct?
Concepts: Percentage of questions correct is number of questions correct divided by the total number of questions on the exam, multiplied by 100.
Formula: Replace concepts with names of variables:
PCCOREX1 = (EXAM1/25) * 100
STEP 1: Identify existing variables, already in data set, from which new variable will be calculated (EXAM1).
STEP 2: Name for new variable, not yet in data set (PCCOREX1).
STEP 3: Write the mathematical formula.
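
As a rough sketch, the formula on this slide translates directly into a small function. The range check and the constant name `EXAM1_MAX_POINTS` are additions for illustration; the 0 to 25 point range comes from the example setup.

```python
from typing import Optional

EXAM1_MAX_POINTS = 25  # EXAM1 is scored from 0 to 25 points


def pccorex1(exam1: Optional[float]) -> Optional[float]:
    """PCCOREX1 = (EXAM1 / 25) * 100, the percentage correct on exam 1."""
    if exam1 is None:
        return None  # missing exam score gives a missing percentage
    if not 0 <= exam1 <= EXAM1_MAX_POINTS:
        raise ValueError("EXAM1 must be between 0 and 25")
    return (exam1 / EXAM1_MAX_POINTS) * 100


print(pccorex1(20))
```

The units change from points (EXAM1) to a percentage (PCCOREX1), which is worth recording in the new variable's label.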

17 Creating a variable for total numeric course grade from exam scores
Logic: From the information in the data set, how does one calculate total numeric course grade?
Concepts: Overall numeric course grade is the sum of the three exam scores.
Formula: Replace concepts with names of variables:
TOTGRADE = EXAM1 + EXAM2 + FINAL
STEP 1: Identify existing variables, already in data set, from which new variable will be calculated (EXAM1, EXAM2, FINAL).
STEP 2: Name for new variable, not yet in data set (TOTGRADE).
STEP 3: Write the mathematical formula.
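
To illustrate how the planned formula might later be applied to a data set, the sketch below adds TOTGRADE to each student record. The records are made-up example data, and representing a data set as a list of dictionaries is just one simple choice; a spreadsheet or statistical package would do the same job.

```python
# Hypothetical student records; the scores are invented for illustration.
students = [
    {"ID": 1, "EXAM1": 20, "EXAM2": 22, "FINAL": 41},
    {"ID": 2, "EXAM1": 15, "EXAM2": 18, "FINAL": 35},
]


def add_totgrade(record: dict) -> dict:
    """Return a copy of the record with TOTGRADE = EXAM1 + EXAM2 + FINAL."""
    new_record = dict(record)  # leave the original record unchanged
    new_record["TOTGRADE"] = record["EXAM1"] + record["EXAM2"] + record["FINAL"]
    return new_record


graded = [add_totgrade(s) for s in students]
for row in graded:
    print(row["ID"], row["TOTGRADE"])
```

Because the exams are worth 25, 25, and 50 points, TOTGRADE ranges from 0 to 100 points.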

18 For new categorical variables
Write the logical steps to classify the values of the original variable into the values of the new variable.
Show how every possible value of the original variable maps into a value of the new variable.
List the
– Value label (descriptive phrase) for each value (category) of the new variable;
– Code (numeric value) that the new variable will take on for each value or set of values of the original variable.

19 Classifying numeric course grades into letter grade ranges
TOTGRADE (Variable Label: Numeric course grade) → LETTRGRD (Variable Label: Final letter grade)
Values of original variable     Values (codes) of new variable     Value labels
<60                             1                                  F
60 TO 69                        2                                  D
70 TO 79                        3                                  C
80 TO 89                        4                                  B
90 OR HIGHER                    5                                  A
STEP 1: Identify existing variables from which new variable will be created (TOTGRADE).
STEP 2: Name for new variable, not yet in data set (LETTRGRD).
STEP 3: Write the logic for classifying the numeric scores into letter grade ranges, based on the university’s standard grade cutoffs. E.g., scores below 60 are classified as an “F.”
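
The classification table above can be expressed as a chain of comparisons plus a lookup for the value labels. One detail is an assumption rather than something the slide states: the table lists ranges such as "60 TO 69" and "70 TO 79", which leaves fractional scores like 69.5 unassigned, so this sketch places them in the lower range by testing "below 70", "below 80", and so on.

```python
from typing import Optional

# Value labels for each code of LETTRGRD, as listed in the table.
LETTRGRD_LABELS = {1: "F", 2: "D", 3: "C", 4: "B", 5: "A"}


def lettrgrd(totgrade: Optional[float]) -> Optional[int]:
    """Classify TOTGRADE (0-100) into the numeric code for LETTRGRD.

    Assumption: fractional scores between listed ranges (e.g. 69.5)
    fall into the lower range.
    """
    if totgrade is None:
        return None  # missing numeric grade gives a missing letter grade
    if not 0 <= totgrade <= 100:
        raise ValueError("TOTGRADE must be between 0 and 100")
    if totgrade < 60:
        return 1
    if totgrade < 70:
        return 2
    if totgrade < 80:
        return 3
    if totgrade < 90:
        return 4
    return 5


for score in (59, 60, 79, 80, 95):
    print(score, LETTRGRD_LABELS[lettrgrd(score)])
```

Keeping the numeric code and the value label separate mirrors the two columns of the planning table.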

20 Missing values for the new variable
Provide instructions to ensure that cases that have missing values on the original variables will also have missing values for new variables that are based on them.
Needed whether the new variable was created using
– A formula
– Classification instructions
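
One possible way to enforce this rule in code is a small wrapper that returns a missing value whenever any input is missing. The helper name `if_complete` and the use of `None` to represent missing data are assumptions for this sketch; statistical packages each have their own missing-value codes.

```python
from typing import Any, Callable, Optional


def if_complete(func: Callable[..., Any], *values: Optional[float]) -> Any:
    """Apply func only when no input is missing; otherwise return None."""
    if any(v is None for v in values):
        return None
    return func(*values)


def total_grade(exam1: float, exam2: float, final: float) -> float:
    """TOTGRADE = EXAM1 + EXAM2 + FINAL."""
    return exam1 + exam2 + final


print(if_complete(total_grade, 20, 22, 41))
print(if_complete(total_grade, 20, None, 41))
```

The same wrapper works for classification functions. The explicit check matters because some tools skip missing values when summing by default, which would silently understate the total for a student who missed an exam.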

21 Summary
It is often necessary to create new variables to answer one’s research question.
Planning steps for creating new variables include
– Identifying source variables available in a data set
– Finding references about how such variables are conventionally analyzed
– Becoming familiar with units or categories of the variables
– Writing formulas or classification instructions to create the new variables from the original variables
– Providing instructions about missing values for the original and new variables

22 Summary, cont.
With the formulas and classification instructions for creating the new variables, one can then use a spreadsheet or statistical software to create those variables within an electronic data set.
Separate
– The researching and planning steps
– The programming steps

23 Suggested resources
Miller, J. E. 2015. The Chicago Guide to Writing about Numbers, 2nd Edition. University of Chicago Press, chapter 10.

24 Suggested practice exercises
NAME of original variable ______________________
LABEL for original variable ______________________
→ NAME of new variable _______________________
→ LABEL for new variable _______________________
Values of original variable     Values (codes) of new variable     Value labels of new variable
Instructions and a planning template can be downloaded from the supplemental online materials at http://press.uchicago.edu/books/miller/numbers/index.htm

25 Suggested online appendixes
How to Create the Variables You Need from the Variables You Have
– Exercise includes
  Step-by-step instructions
  A template planning grid for a new categorical variable
– Paper for instructors on how to teach the concepts and skills
Getting to Know Your Variables
– Exercise to familiarize researchers with the concepts, units, categories of variables in their data set
– Paper for instructors on how to teach the concepts and skills

26 Contact information
Jane E. Miller, PhD
jmiller@ifh.rutgers.edu
Online materials available at http://press.uchicago.edu/books/miller/numbers/index.html

