Published by Paulina Newton. Modified over 8 years ago.
1
Saving Everyone’s Time and Energy: Practical Tips for Database Design
Cynthia Wilson Garvan, PhD Statistics, MA Mathematics
College of Nursing
cgarvan@ufl.edu
2
Outline
Know where you are going
Tables and figures for your manuscript or report
Consort diagram
Data Dictionaries
What statistical software likes
The Basics: Static and Dynamic Datasets
Linking datasets
Bad examples
Things that can go wrong
Choosing the right tool: EXCEL, ACCESS, Qualtrics, REDCap
3
What questions do you have about databases?
4
Know where you are going
5
Consort Diagram
6
Data Dictionary
7
Example: Data Dictionary_Fatigue_Study.doc
8
What statistical software likes
9
The cervical data
10
Setting up database

ID    gender  age  level34  level45  level56  level67
34    1            0        1        0        0
42    2       33   0        0        1        1
12    1       23   1        0        1        0
102   2       26   1        1        0        1
86    1       47   0        0        1        0
11
BAD EXAMPLE

# ID  gender of patient  age  location
34    f                       45
42    male               33   56,67
12    F                  23   34,56
102   M                  26   34,45,67
86    woman              47   56
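The difference between the two layouts can be made concrete with a small cleanup routine. The following is a minimal Python sketch, not part of the original slides: the function name `clean_row` is hypothetical, and the coding 1 = female, 2 = male is an assumption inferred by matching patient IDs between the good and bad tables. It turns free-text gender into one numeric code and splits the comma-separated location list into one 0/1 column per level.

```python
# Assumed coding (inferred by matching IDs across the two slides): 1 = female, 2 = male.
GENDER_CODES = {"f": 1, "female": 1, "woman": 1, "m": 2, "male": 2}
LEVELS = ("34", "45", "56", "67")


def clean_row(patient_id, gender, age, location):
    """Return one tidy record: coded gender, numeric age, one 0/1 column per level."""
    levels_present = {part.strip() for part in location.split(",") if part.strip()}
    record = {
        "ID": int(patient_id),
        "gender": GENDER_CODES[gender.strip().lower()],
        "age": int(age) if age.strip() else None,  # None marks a missing age
    }
    for level in LEVELS:
        record["level" + level] = 1 if level in levels_present else 0
    return record


# The rows of the bad example, typed in as raw text.
bad_rows = [
    ("34", "f", "", "45"),
    ("42", "male", "33", "56,67"),
    ("12", "F", "23", "34,56"),
    ("102", "M", "26", "34,45,67"),
    ("86", "woman", "47", "56"),
]
clean_rows = [clean_row(*row) for row in bad_rows]
for row in clean_rows:
    print(row)
```

Each cleaned record holds exactly one value per variable, which is the shape statistical software expects.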
12
Tips
Top row contains variable names
Software has naming conventions
Lists are helpful (e.g., Q1 – Q20)
Short but meaningful names are very helpful
Variable names should be spelled correctly and used consistently
13
Rules of SAS Use
1. All variable and data set names must start with a letter or an underscore (_).
2. Names can contain only letters, numerals, and the underscore. No %$!*&#@.
3. All variable and data set names must be thirty-two (32) or fewer characters in length (8 characters in SAS V6.12 or lower).
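The three naming rules above can be checked before any data reach SAS. This is a minimal Python sketch under the 32-character limit stated on the slide; the helper name `is_valid_sas_name` is hypothetical and is not a SAS or Python library function.

```python
import re

# Rule 1: first character is a letter or underscore.
# Rule 2: remaining characters are letters, numerals, or underscores only.
# Rule 3: total length is 32 characters or fewer (1 + up to 31 more).
SAS_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,31}")


def is_valid_sas_name(name):
    """Return True when the whole name satisfies all three rules."""
    return SAS_NAME.fullmatch(name) is not None


for candidate in ["age", "_q1", "1st_visit", "gender of patient", "cost$"]:
    print(candidate, is_valid_sas_name(candidate))
```

Running such a check over the top row of a spreadsheet catches names like "gender of patient" before they cause an import problem.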
14
The Basics: Static and Dynamic Datasets
Set up static and dynamic datasets separately!
15
Static Dataset
One record per subject
Static dataset contains data that will not change over time
16
Dynamic Dataset
Multiple records possible for each subject
Must contain an identifying variable and a variable to indicate the repetition (e.g., date)
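The two definitions translate into two uniqueness checks: the subject ID alone must be unique in a static dataset, and the ID plus the repetition variable must be unique in a dynamic one. A minimal Python sketch follows; the function name `duplicate_keys` and the tiny sample records are illustrative assumptions.

```python
from collections import Counter


def duplicate_keys(records, key_fields):
    """Return the key combinations that appear on more than one record."""
    counts = Counter(tuple(rec[field] for field in key_fields) for rec in records)
    return sorted(key for key, n in counts.items() if n > 1)


# Static: one record per subject.
static = [{"ID": 34, "gender": 1}, {"ID": 42, "gender": 2}]

# Dynamic: several records per subject, told apart by the repetition variable.
dynamic = [
    {"ID": 34, "injection": 1},
    {"ID": 34, "injection": 2},
    {"ID": 42, "injection": 1},
]

print(duplicate_keys(static, ["ID"]))                # no repeated subjects expected
print(duplicate_keys(dynamic, ["ID"]))               # ID alone repeats in a dynamic dataset
print(duplicate_keys(dynamic, ["ID", "injection"]))  # ID plus repetition should be unique
```

An empty result from the first and third checks suggests each dataset has the structure its type requires.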
17
Linking datasets
18
Cervical Data from Another Hospital

ID    gender  age  level34  level45  level56  level67
203   2       44   0        1        0        0
223   2       32   0        1        1        1
298   1       87   0        0        1        0
217   2       56   1        0        0        1
19
Linking Data Sets with the same structure
SET: stack data that share common variables
[Diagram: Data Set #1 (DS_1) stacked on top of Data Set #2 (DS_2)]

    DATA new_ds;
      SET ds_1 ds_2;
    RUN;

If a variable exists in only one of the data sets, it is set to missing for the rows that come from the other data set.
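The stacking behaviour, including the missing values for a variable found in only one data set, can be illustrated outside SAS. This is a minimal Python sketch, not a reimplementation of the SAS SET statement: the function `stack` and the extra variable `site` are hypothetical, and `None` stands in for a SAS missing value.

```python
def stack(*datasets):
    """Concatenate row lists; a variable absent from a row becomes None (missing)."""
    columns = []
    for dataset in datasets:
        for row in dataset:
            for name in row:
                if name not in columns:
                    columns.append(name)  # keep first-seen column order
    return [
        {name: row.get(name) for name in columns}
        for dataset in datasets
        for row in dataset
    ]


ds_1 = [{"ID": 42, "gender": 2, "age": 33}]
# "site" is a made-up variable that exists only in the second data set.
ds_2 = [{"ID": 203, "gender": 2, "age": 44, "site": "B"}]

new_ds = stack(ds_1, ds_2)
for row in new_ds:
    print(row)
```

The row from `ds_1` gains a `site` entry that is missing, which is why variables should be named and coded identically in data sets that will be stacked.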
20
Different type of cervical data

ID    injection
34    1
      2
      3
42    1
      2
      3
102   1
      2
86    1
      2
      3
      4
21
Linking Data Sets with different structures
MERGE: add variables (new variables matched on a common ID)
[Diagram: Data Set #1 (DS_1) and Data Set #2 (DS_2) placed side by side and matched on the common ID]

    DATA new_ds;
      MERGE ds_1 ds_2;
      BY ID;
    RUN;

Data sets must first be sorted by the common ID.
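The idea of matching a static data set to a dynamic one on a common ID can be sketched in Python. This illustrates one-to-many matching only and does not reproduce every detail of the SAS MERGE statement; the function `merge_by_id` is hypothetical, the ID is assumed to be filled in on every dynamic row, and `None` stands in for a missing value.

```python
def merge_by_id(static_rows, dynamic_rows, key="ID"):
    """Match rows on `key`; each dynamic row gains its subject's static variables."""
    static_by_id = {row[key]: row for row in static_rows}
    dynamic_by_id = {}
    for row in dynamic_rows:
        dynamic_by_id.setdefault(row[key], []).append(row)

    columns = []
    for row in list(static_rows) + list(dynamic_rows):
        for name in row:
            if name not in columns:
                columns.append(name)

    merged = []
    # Walking the IDs in sorted order plays the role of sorting both inputs first.
    for subject in sorted(set(static_by_id) | set(dynamic_by_id)):
        base = static_by_id.get(subject, {key: subject})
        for extra in dynamic_by_id.get(subject, [{}]):
            combined = {**base, **extra}
            merged.append({name: combined.get(name) for name in columns})
    return merged


ds_1 = [{"ID": 34, "gender": 1}, {"ID": 12, "gender": 1}]
ds_2 = [
    {"ID": 34, "injection": 1},
    {"ID": 34, "injection": 2},
    {"ID": 34, "injection": 3},
]
new_ds = merge_by_id(ds_1, ds_2)
for row in new_ds:
    print(row)
```

Subject 12 has no injection records, so it appears once with a missing injection value, while subject 34 appears three times with the same gender on each row.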
22
Bad examples
23
Bad examples
Variable names too long
Static and dynamic datasets mixed in
No context for variable name
Same variable in two datasets that will be merged together
Variable is numeric in one dataset and character in another (BIG SAS error)
Over time, variable values are different, questions asked a different way
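The numeric-versus-character mismatch is easy to detect before two data sets are combined. This is a minimal Python sketch; the function `type_conflicts` is hypothetical, and Python's `int` and `str` stand in for SAS numeric and character variables.

```python
def type_conflicts(ds_a, ds_b):
    """Return shared variables whose value types differ between two datasets."""

    def types_of(rows):
        seen = {}
        for row in rows:
            for name, value in row.items():
                if value is not None:  # missing values carry no type information
                    seen.setdefault(name, set()).add(type(value).__name__)
        return seen

    a, b = types_of(ds_a), types_of(ds_b)
    return {
        name: (sorted(a[name]), sorted(b[name]))
        for name in a.keys() & b.keys()
        if a[name] != b[name]
    }


hospital_1 = [{"ID": 34, "age": 45}]
hospital_2 = [{"ID": "34", "age": 45}]  # ID was entered as text here

print(type_conflicts(hospital_1, hospital_2))
```

Any variable reported here needs to be converted to a single type in both sources before stacking or merging.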
24
Things that can go wrong
25
Duplicated records
Database isn’t set up to distinguish missing data, refused data, “don’t know” data, and not applicable data
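One common way to keep the kinds of non-response apart is to reserve a distinct code for each. The Python sketch below is an illustration only: the codes -6 to -9 and the function `summarize_item` are hypothetical choices, not values from the presentation, and any real study should record its own codes in the data dictionary.

```python
# Hypothetical reserved codes; pick values that cannot occur as real answers.
MISSING_CODES = {
    -6: "not applicable",
    -7: "refused",
    -8: "don't know",
    -9: "missing",
}


def summarize_item(values):
    """Count real answers and each kind of non-response separately."""
    summary = {"answered": 0}
    for label in MISSING_CODES.values():
        summary[label] = 0
    for value in values:
        if value in MISSING_CODES:
            summary[MISSING_CODES[value]] += 1
        else:
            summary["answered"] += 1
    return summary


print(summarize_item([3, -7, 5, -9, -9]))
```

With separate codes, a refusal is never confused with a skipped question when the data are analyzed.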
26
Choosing the right tool: EXCEL, ACCESS, SPSS, Qualtrics, Survey Monkey, REDCap
27
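[Table: tools compared against the criteria below. The column alignment of the YES/NO cells is not recoverable, so each tool is listed with its surviving values in their original order.]

Criteria:
Easy to restrict variable values
Easy to learn
Free for all UF users
Data entry screen looks like form
Good tool for sending surveys
Branch logic
Backed up in Cloud?
Appropriate for complex database?
Generates data dictionary
UF technical support available

Tools:
Excel: NO, YES, NO, YES (HARD), NO
Access: YES, NO, YES, NO, YES, NO, YES, NO
SPSS: NO, YES, NO, YES, NO
Qualtrics: YES, NO, YES
Survey Monkey: YES, NO, YES, NO
REDCap: YES, NO, YES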
28
Tips for Success
29
Review your database with a statistician
Pilot test yourself
Pilot test with data entry person
Enter some mock data and mock analyze your data
Make a plan to regularly review data
Try to think about what could go wrong – it often does!