1
Programming Standards and Practices
Bringing Order to the Chaos of Code
Michael Swetz & Beverly Musick
2
First Steps
Problem definition
  State the problem in non-technical terms:
    “The researchers are having difficulty enrolling women into their study and want more information on how currently enrolled female subjects were recruited.”
    vs.
    “We need an ODS report summarizing active subjects by race, gender, age, county of residence, and referral source.”
  Don’t presuppose what your solution is going to be.
3
First Steps
Define your requirements
  Specify your outputs
    Lay out any reports you want to create
  Specify your inputs
    What variables do I need?
    What variables do I have and where are they?
    What are the ranges the variables take?
    What are my assumptions about my inputs? (E.g., assuming unique visit dates for a given subject. Verify your assumptions as necessary.)
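The assumption named above (unique visit dates for a given subject) can be checked before anything depends on it. The slides target SAS; what follows is only a minimal Python sketch of the same check. The field names `subject_id` and `visit_date` and the sample records are hypothetical.

```python
from collections import Counter

def find_duplicate_visits(visits):
    """Return (subject_id, visit_date) pairs that occur more than once."""
    counts = Counter((v["subject_id"], v["visit_date"]) for v in visits)
    return sorted(key for key, n in counts.items() if n > 1)

# Hypothetical sample records, for illustration only.
visits = [
    {"subject_id": 101, "visit_date": "2006-03-01"},
    {"subject_id": 101, "visit_date": "2006-03-01"},  # violates the assumption
    {"subject_id": 102, "visit_date": "2006-03-01"},
]

duplicates = find_duplicate_visits(visits)
if duplicates:
    print("Assumption violated for:", duplicates)
```

An empty result means the assumption holds for the data at hand; a non-empty one is a prompt to revisit the requirements before writing more code.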
4
First Steps
Break your requirements down into logical sections that guide your program organization towards meeting your overall goal.
  Consider separate sections for: gathering your inputs, merging them together, computing summary statistics from the merged datasets, and creating reports.
  These can be separate programs if necessary.
  Consider where you may re-use code between sections.
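One way to picture the four sections is as four small, separately testable steps. This is a Python sketch rather than SAS, and every name and value in it (the registry, the referral sources, the function names) is a hypothetical stand-in.

```python
def gather_inputs():
    """Section 1: gather inputs (hard-coded hypothetical data for the sketch)."""
    registry = {1: {"gender": "F"}, 2: {"gender": "M"}, 3: {"gender": "F"}}
    referrals = {1: "clinic", 2: "flyer", 3: "clinic"}
    return registry, referrals

def merge_inputs(registry, referrals):
    """Section 2: merge the inputs on subject ID."""
    return [
        {"subject_id": sid, **fields, "referral": referrals.get(sid)}
        for sid, fields in registry.items()
    ]

def summarize(subjects):
    """Section 3: count female subjects by referral source."""
    counts = {}
    for s in subjects:
        if s["gender"] == "F":
            counts[s["referral"]] = counts.get(s["referral"], 0) + 1
    return counts

def create_report(counts):
    """Section 4: format the summary as report lines."""
    return [f"{source}: {n}" for source, n in sorted(counts.items())]

def main():
    registry, referrals = gather_inputs()
    subjects = merge_inputs(registry, referrals)
    return create_report(summarize(subjects))

print("\n".join(main()))
```

Each section could equally live in its own program; the point is only that the boundaries follow the requirements.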
5
Laying out the Program
Name your program
  The name should clearly define the purpose of the program. E.g., “CreateDemographicsReport”, “ImportSubjectRegistry”, “FlagMissingDemog”, etc.
  If you can’t come up with a good name (you keep coming back to “HandleStuff”, “MergeDatasets”, “Combo”, or “Program 1”), odds are that you haven’t carefully defined the purpose of the program.
6
Laying out the Program
Standard header should include:
  name of the program
  description of the program’s general purpose
  author’s name
  date the program was created
  the inputs that the program uses
  the outputs that the program creates
  any general usage notes, such as “update database closure date before running”
  a history of changes and who made them
7
Sample Header
/*
Program Name:
Description:
Author(s):
Date Created:
Inputs:
Outputs:
Usage Notes:
Calls macros:
Modifications (Date, Author, Comments):
*/
We may develop a Biostatistics standard header template.
8
Laying out the Program
Lay out the entire program using Program Design Language (PDL)
PDL
  Step-by-step ENGLISH comments about what you’re trying to do. E.g., “Get list of pediatric subjects and their most recent diagnosis.”
  Written at the level of intent, NOT “Use Proc SQL to connect to the database and retrieve the table tblSubjectRegistry where age < 13”
9
Laying out the Program
PDL (Continued)
  Should be detailed enough so that someone who doesn’t know the computing language could read the PDL and understand what the program does.
  Note any idiosyncrasies of the data. E.g., “This variable was imported from a legacy system. Unfortunately, it was never renamed and although still called ‘Visdate’, it actually represents the date the web form was filled out.”
10
Laying out the Program
Review the PDL to make sure it achieves the aims of the program and that it uses all the data you plan to bring in.
* Show example of well indented program and poorly indented one.
11
Begin Programming Standards Practicum
Write header and PDL
12
Adding the Code
Write the actual code between the lines of PDL
  Review and submit in sections to confirm correctness of your logic and syntax before moving to the next section.
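As an illustration of code written between lines of PDL, the comments below were drafted first as intent-level PDL and the code was then filled in under each one. This is a Python sketch rather than SAS. The row layout and sample data are hypothetical; the pediatric cutoff of 13 follows the “age < 13” example in the PDL slide.

```python
PEDIATRIC_AGE_LIMIT = 13  # subjects younger than this count as pediatric

# Hypothetical registry rows: (subject_id, age, diagnosis_date, diagnosis)
registry = [
    (1, 9, "2005-01-10", "asthma"),
    (1, 9, "2006-02-15", "eczema"),
    (2, 40, "2006-03-01", "hypertension"),
    (3, 12, "2004-07-04", "asthma"),
]

def latest_pediatric_diagnoses(rows):
    # Get the list of pediatric subjects.
    pediatric = [r for r in rows if r[1] < PEDIATRIC_AGE_LIMIT]

    # For each pediatric subject, keep only the most recent diagnosis.
    # (ISO-formatted date strings compare correctly as text.)
    latest = {}
    for subject_id, _age, date, diagnosis in pediatric:
        if subject_id not in latest or date > latest[subject_id][0]:
            latest[subject_id] = (date, diagnosis)

    # Return one (subject, diagnosis) pair per subject, ordered by subject.
    return [(sid, latest[sid][1]) for sid in sorted(latest)]

print(latest_pediatric_diagnoses(registry))
```

Read on their own, the three comments still describe what the function is for, which is the test the PDL slides propose.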
13
Reviewing the Program
Read and mentally test your code
  Will missing/zero values cause your computations to fail?
  How sure am I about the assumptions I made about the data and what happens if I’m wrong?
  Are the merge relationships correctly defined and based on non-null fields?
  Did I bring in any data that I don’t need?
  Am I exporting only what I need to meet my requirements?
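Two of the review questions above (missing or zero values, and non-null merge fields) translate directly into small guards. The Python sketch below is an illustration only, not SAS; the function names and the `subject_id` key are hypothetical.

```python
def percent_female(n_female, n_total):
    """Return the percentage of female subjects, or None if it cannot be computed."""
    if n_female is None or n_total is None or n_total == 0:
        return None  # missing or zero denominator: report "unknown", do not crash
    return 100.0 * n_female / n_total

def rows_with_missing_key(rows, key):
    """Return the rows whose merge key is absent or None."""
    return [r for r in rows if r.get(key) is None]

# Hypothetical rows; the second and third would break a merge on subject_id.
rows = [{"subject_id": 1}, {"subject_id": None}, {}]
print(percent_female(3, 12), percent_female(3, 0))
print(rows_with_missing_key(rows, "subject_id"))
```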
14
Submitting the Program
Run the program
Review your log and output
  Do the record counts after a merge or subsetting conditional statement make sense?
  Were any warnings or error messages produced?
  Does my output look like what I expected?
Debug as necessary
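The record-count question can be made routine by having the merge step report its own counts. This is a Python sketch of an inner join with counts; in SAS the same information comes from the log. The dataset contents and field names are hypothetical.

```python
def merge_with_counts(left, right, key):
    """Inner-join two lists of dicts on `key` and report record counts."""
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)

    merged = [
        {**l_row, **r_row}
        for l_row in left
        for r_row in right_by_key.get(l_row[key], [])
    ]
    counts = {"left": len(left), "right": len(right), "merged": len(merged)}
    return merged, counts

# Hypothetical inputs: subject 3 has no visits, and visit subject 4 is unknown.
demographics = [
    {"subject_id": 1, "gender": "F"},
    {"subject_id": 2, "gender": "M"},
    {"subject_id": 3, "gender": "F"},
]
visits = [
    {"subject_id": 1, "visit_date": "2006-01-05"},
    {"subject_id": 1, "visit_date": "2006-02-09"},
    {"subject_id": 2, "visit_date": "2006-01-20"},
    {"subject_id": 4, "visit_date": "2006-03-02"},
]

merged, counts = merge_with_counts(demographics, visits, "subject_id")
print(counts)
```

Here four visit records go in and three come out, which is exactly the kind of difference worth explaining before trusting the output.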
15
Continue Programming Standards Practicum
Write code to extract male patients
16
Why bother?
  Helps collect your thoughts and organize your program.
  You’ll code more efficiently.
  You may discover you don’t have adequate data to address a given issue before you get too far into it.
  Documents your code as you go along.
  Makes you focus on what you need to do, not just what you know how to do.
  Eases transfer of responsibilities.
17
Anticipated Complaints
“I know what my code does and comments are just redundant.”
  Good comments aren’t redundant.
  Will you remember what every piece of code you wrote does in 6 months, a year?
  How much time do you want to spend explaining your code to the person who takes over your project?
18
Anticipated Complaints
“I just don’t have time for this.”
  If you’re writing a program, you’re doing program design whether you realize it or not.
  This isn’t as elaborate as it sounds. Essentially all you’re doing is jotting down a few lines about what data you have, what you need to do with it, and what you’re trying to achieve.
  You’ll save time on code maintenance, re-use, and updates.
  Good organization makes it a lot easier to get help if you need it.
19
General Rules of Thumb
Narrowly define your program’s purpose.
  Resist the temptation to do everything in one place or combine two programs just because there is some common code or data between them.
Do not re-use variable names if the variable has been changed.
  Re-using the name makes it less obvious that a value has been altered since being taken from the data source.
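The variable-naming rule can be shown with a unit conversion: the derived value gets its own name, and the source value is left untouched. This is a Python sketch rather than SAS, and the field names and record are hypothetical.

```python
LB_PER_KG = 2.20462  # pounds per kilogram

def add_weight_kg(record):
    """Return a copy of the record with a new, separately named derived field."""
    updated = dict(record)
    # Store the converted value under a new name; weight_lb keeps the source value.
    updated["weight_kg"] = round(record["weight_lb"] / LB_PER_KG, 1)
    return updated

source = {"subject_id": 101, "weight_lb": 154.0}  # hypothetical record
derived = add_weight_kg(source)
print(derived)
```

Had the result been written back to a field simply called `weight`, nothing downstream would signal that the number no longer matches the data source.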
20
General Rules of Thumb
Do not duplicate code across programs.
  You may not like it, but it’s well worth the time to create a macro that can be shared across programs.
Comment, comment, comment.
Have logical divisions within your programs.
  No random walk coding.
Indent your code.
21
General Rules of Thumb
Link any output back to the program that created it.
  Label permanent SAS datasets, put footers in reports, and store files logically.
If a program creates a permanent dataset and will be run repeatedly, provide a mechanism to run the code without overwriting the dataset automatically.
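One possible mechanism for the overwrite rule is an explicit opt-in flag, so that a rerun fails loudly instead of silently replacing a permanent file. This is a Python sketch, not SAS; the `save_dataset` function, its `overwrite` parameter, and the file name are hypothetical.

```python
import tempfile
from pathlib import Path

def save_dataset(path, text, overwrite=False):
    """Write `text` to `path`, refusing to replace an existing file by default."""
    path = Path(path)
    if path.exists() and not overwrite:
        raise FileExistsError(f"{path} exists; pass overwrite=True to replace it")
    path.write_text(text)
    return path

# Demonstration in a temporary directory so nothing permanent is touched.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "demographics.csv"
    save_dataset(target, "subject_id,gender\n101,F\n")
    try:
        save_dataset(target, "subject_id,gender\n")  # second run, no opt-in
    except FileExistsError as err:
        print("Refused:", err)
```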
22
General Rules of Thumb
Code defensively
  Avoid “- -” when defining arrays.
  Avoid “:” when dropping or keeping variables. E.g., “Drop vars:”
  Explicitly convert datatypes when necessary.
  Don’t hardcode constants; use macro variables.
  Avoid hardcoded data changes.
  Check for unexpected conditions.
  Don’t use implicit runs.
  Always list the dataset in every procedure.
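Three of the language-independent items above (explicit type conversion, named constants, and checks for unexpected conditions) can be sketched together. This is Python, not SAS, where the constant would be a macro variable. The function name, the gender codes, and the age cutoff are hypothetical.

```python
PEDIATRIC_AGE_LIMIT = 13  # named constant instead of a literal buried in the logic
VALID_GENDERS = {"F", "M"}  # assumed coding scheme, for illustration only

def classify(raw_age, gender):
    """Return 'pediatric' or 'adult', failing loudly on unexpected input."""
    age = int(raw_age)  # explicit conversion; raises ValueError on non-numeric text
    if age < 0:
        raise ValueError(f"unexpected age: {age}")
    if gender not in VALID_GENDERS:
        raise ValueError(f"unexpected gender code: {gender!r}")
    return "pediatric" if age < PEDIATRIC_AGE_LIMIT else "adult"

print(classify("12", "F"), classify("13", "M"))
```

The checks stop the run at the first surprising value, which is usually cheaper than finding the problem in a finished report.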
23
Complete Programming Standards Practicum