

Using AMRS Data in Research September 15, 2008 Beverly Musick Indiana University, Division of Biostatistics.


1 Using AMRS Data in Research
September 15, 2008
Beverly Musick
Indiana University, Division of Biostatistics

2 Using AMRS Data in Research
– Extracting data from MySQL database
– Conversion to SAS datasets
– Processing and cleaning of data for specific research question
– Preliminary analysis
– Example

3 Extracting data from AMRS (MySQL database)
– Mysqldump command

    C:\>"c:\Program Files\Mysql\Mysql Server 5.0\bin\mysqldump" -u root -p amrs concept > database-dump.sql

– OpenMRS reporting tool: typically generates Excel spreadsheets
– ODBC connection to SAS


6 SAS Code for Creating SAS Datasets from AMRS

    ** permamrsKE.SAS **
    * This program creates permanent SAS datasets from AMRS SQL database
    * 6/28/06 BSM
    ******************************************************************** ;

    libname k 'c:\Kenya\HIV\DataMigration' ;
    footnote 'c:\kenya\hiv\SASCODE\permamrsKE.sas' ;
    libname m odbc user=jsmith password=B*24c9dl database=amrsEldoret ;

    data k.AMRSusers ; set m.users ; run ;
    data k.AMRSpatient ; set m.patient ; run ;
    data k.AMRSpatient_identifier ; set m.patient_identifier ;
    data k.AMRSpatient_identifier_type ; set m.patient_identifier_type ;
    data k.AMRSpatient_name ; set m.person_name ;
    data k.AMRSencounter ; set m.encounter ; run ;
    data k.AMRSencounter_type ; set m.encounter_type ;
    data k.AMRSlocation ; set m.location ;

7 Extracting the OBS Table

    data k.obs1a ;
      set m.obs(keep=obs_id person_id concept_id encounter_id order_id obs_datetime
                     location_id obs_group_id accession_number value_group_id
                     value_boolean value_coded value_drug value_datetime
                     value_numeric value_modifier
                obs=3000000) ;
    run ;
    data k.obs1b ;
      set m.obs(keep=obs_id person_id concept_id encounter_id order_id obs_datetime
                     location_id obs_group_id accession_number value_group_id
                     value_boolean value_coded value_drug value_datetime
                     value_numeric value_modifier
                firstobs=3000001 obs=6100000) ;
    run ;
    …
    data k.obs1y ;
      set m.obs(keep=obs_id person_id concept_id encounter_id order_id obs_datetime
                     location_id obs_group_id accession_number value_group_id
                     value_boolean value_coded value_drug value_datetime
                     value_numeric value_modifier
                firstobs=14500001 obs=15000000) ;
    run ;

    data k.obs2a ;
      set m.obs(keep=obs_id value_text date_started date_stopped comments creator
                     date_created voided voided_by date_voided void_reason
                obs=3000000) ;
    run ;
    data k.obs2b ;
      set m.obs(keep=obs_id value_text date_started date_stopped comments creator
                     date_created voided voided_by date_voided void_reason
                firstobs=3000001 obs=6000000) ;
    run ;
    …
    data k.obs2y ;
      set m.obs(keep=obs_id value_text date_started date_stopped comments creator
                     date_created voided voided_by date_voided void_reason
                firstobs=14500001 obs=15000000) ;
    run ;

8 Extracting the OBS Table (cont.)

    data obs1 ;
      set k.obs1a k.obs1b k.obs1c k.obs1d k.obs1e k.obs1f k.obs1g k.obs1h k.obs1i
          k.obs1j k.obs1k k.obs1l k.obs1m k.obs1n k.obs1o k.obs1p k.obs1q k.obs1r
          k.obs1s k.obs1t k.obs1u k.obs1v k.obs1w k.obs1x k.obs1y ;
    run ;
    proc sort data=obs1 ; by obs_id person_id ;
    data obs1a ;
      set obs1 ;
      by obs_id person_id ;
      if last.obs_id ;
    run ;

    data obs2 ;
      set k.obs2a k.obs2b k.obs2c k.obs2d k.obs2e k.obs2f k.obs2g k.obs2h k.obs2i
          k.obs2j k.obs2k k.obs2l k.obs2m k.obs2n k.obs2o k.obs2p k.obs2q k.obs2r
          k.obs2s k.obs2t k.obs2u k.obs2v k.obs2w k.obs2x k.obs2y ;
    run ;
    proc sort data=obs2 ; by obs_id ;
    data obs2a ;
      set obs2 ;
      by obs_id ;
      if last.obs_id ;
    run ;

    proc sort data=obs1a ; by obs_id ;
    proc sort data=obs2a ; by obs_id ;
    data k.AMRSobs ;
      merge obs1a(in=o) obs2a ;
      by obs_id ;
      if o ;
    run ;
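The lettered data steps used to pull the OBS table differ only in their firstobs= and obs= values, so the same chunked extraction could be generated by a macro loop. The sketch below is not from the presentation: the macro name pull_obs, its parameters, the uniform chunk size, and the numeric dataset suffixes are all assumptions (the chunk boundaries on the slides are uneven), and the keep= list is shortened for space. It assumes the k and m librefs from permamrsKE.sas are already assigned.

```sas
/* Hypothetical macro: pulls m.obs in equal-sized chunks.             */
/* chunk= and nchunks= are illustrative values, not the originals.    */
%macro pull_obs(chunk=1000000, nchunks=15) ;
  %local i first last ;
  %do i = 1 %to &nchunks ;
    %let first = %eval((&i - 1) * &chunk + 1) ;  /* first row of this chunk */
    %let last  = %eval(&i * &chunk) ;            /* last row of this chunk  */
    data k.obs1_&i ;
      set m.obs(keep=obs_id person_id concept_id encounter_id obs_datetime
                     value_coded value_numeric value_datetime
                firstobs=&first obs=&last) ;
    run ;
  %end ;
%mend pull_obs ;

%pull_obs(chunk=1000000, nchunks=15)
```

The resulting k.obs1_1 through k.obs1_15 datasets would then be stacked in a single set statement, in the same way the lettered datasets are combined on the slide.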

9 Conversion to Master SAS datasets
Once the data have been extracted from AMRS, we use HIVcombine.sas to merge them with data from the old ACCESS DB and create the Master SAS datasets
– HIVDEMOG2.sas7bdat (stores the cross-sectional and demographic data)
– HIVVISIT2.sas7bdat (stores the longitudinal data, which come mostly from the follow-up visits)

10 Sample records from the OBS table
(person_id and encounter_id are run together in this transcript, so they are shown as one unsplit column)

    obs_id   person_id+encounter_id  obs_datetime          location_id  concept_id  value_coded  value_numeric  value_datetime        creator
    2005027  182653028               23JUN2006:00:00:00.0  14           5282        1139         .              .                     150
    2005028  182653028               23JUN2006:00:00:00.0  14           6096        1066         .              .                     150
    2005031  182653028               23JUN2006:00:00:00.0  14           1154        .            1.0            .                     150
    2005032  182653028               23JUN2006:00:00:00.0  14           1192        .            0.0            .                     150
    2005033  182653028               23JUN2006:00:00:00.0  14           1156        1066         .              .                     150
    2005034  182653028               23JUN2006:00:00:00.0  14           5092        .            94.0           .                     150
    2005038  182653028               23JUN2006:00:00:00.0  14           5088        .            36.3           .                     150
    2005039  182653028               23JUN2006:00:00:00.0  14           5089        .            57.5           .                     150
    2005040  182653028               23JUN2006:00:00:00.0  14           5356        1204         .              .                     150
    2005041  182653028               23JUN2006:00:00:00.0  14           1248        .            0.0            .                     150
    2005042  182653028               23JUN2006:00:00:00.0  14           5096        .            .              28JUN2006:00:00:00.0  150
    2005043  182653028               23JUN2006:00:00:00.0  14           6042        197          .              .                     150
    2005044  182653028               23JUN2006:00:00:00.0  14           1112        1107         .              .                     150
    2005054  2841153029              19JUN2006:00:00:00.0  2            5282        1139         .              .                     152
    2005055  2841153029              19JUN2006:00:00:00.0  2            5096        .            .              04JUL2006:00:00:00.0  152
    2005056  2841153029              19JUN2006:00:00:00.0  2            5605        1052         .              .                     152
    2005057  2841153029              19JUN2006:00:00:00.0  2            5606        1065         .              .                     152

11 Conversion to Master SAS datasets
The process involves
– Selecting the concept of interest (concept_id=5089)
– Creating a specific variable for that concept (weight)
– Assigning the appropriate value (value_coded, value_numeric, etc.) to the newly created variable (weight=value_numeric)
– Merging the records that have the same obs_datetime (or encounter_id)

12 Example of Converting a Concept into a SAS Variable
(person_id and encounter_id are run together in this transcript, so they are shown as one unsplit column)

Start with record from OBS table

    obs_id   person_id+encounter_id  obs_datetime          location_id  concept_id  value_coded  value_numeric  value_datetime  creator
    2005039  182653028               23JUN2006:00:00:00.0  14           5089        .            57.5           .               150

Convert the concept_id to a SAS variable

    person_id+encounter_id  apptdate   clinic         weight
    182653028               23JUN2006  MTRH Module 3  57.5

13 SAS code to create WEIGHT and merge with other longitudinal data

    data wght ;
      set amrsobscon(where=(concept_id = 5089) rename=(obsdate=apptdate)) ;
      wght=value_numeric ;
      ** delete bogus weights ** ;
      if wght gt 120 then delete ;
      keep person_id encounter_id apptdate wght ;
    run ;
    …
    data visit ;
      merge wght hght tbtx… ;
      by person_id apptdate ;
    run ;
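A match-merge with a by statement expects every input dataset to be sorted by the by variables, so a sort step is normally needed between building the per-concept datasets and merging them. The sketch below shows that step for two of the datasets named on the slide; it assumes hght was built the same way as wght and carries the same person_id and apptdate variables, which the slide does not show.

```sas
/* Sort each per-concept dataset by the merge keys first. */
proc sort data=wght ; by person_id apptdate ; run ;
proc sort data=hght ; by person_id apptdate ; run ;   /* hght: assumed structure */

/* One row per person per appointment date, with one column per concept. */
data visit ;
  merge wght hght ;
  by person_id apptdate ;
run ;
```

If a person has more than one record for the same concept on the same date, the merge pairs them in order of appearance, so it may be worth checking for duplicates within each by group before merging.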

14 Preliminary Analysis
Generate frequency tables (PROC FREQ) for
– All categorical data such as gender, WHO stage, yes/no/unknown questions, clinic location, etc.
– Limited-response continuous data such as number of people in household
Generate means/medians (PROC MEANS) for
– All continuous data such as age, weight, CD4
– Ordinal data such as ‘1=strongly disagree 2=disagree 3=agree 4=strongly agree’

15 Proc Freq

    proc freq data=h.hivdemog2 ;
      title 'AMPATH Demographics' ;
      tables male married traveltime ;
      tables male*married / missing ;
    run ;

16

    male   Frequency   Percent   Cumulative Frequency   Cumulative Percent
    0      46227       65.45     46227                  65.45
    1      24403       34.55     70630                  100.00
    Frequency Missing = 77

    Married   Frequency   Percent   Cumulative Frequency   Cumulative Percent
    0         31029       49.82     31029                  49.82
    1         31253       50.18     62282                  100.00
    Frequency Missing = 8425

    travel time to clinic
    TravelTime      Frequency   Percent   Cumulative Frequency   Cumulative Percent
    < 30 minutes    15390       27.99     15390                  27.99
    30-60 minutes   17214       31.30     32604                  59.29
    1-2 hours       13163       23.94     45767                  83.23
    > 2 hours       9223        16.77     54990                  100.00
    Frequency Missing = 15717

17 Proc Means

    proc means data=h.hivvisit2 n mean std min max median ;
      title 'AMPATH Visit Data' ;
      var age weight cd4 ;
    run ;

18 AMPATH Visit Data
The MEANS Procedure

    Variable   Label         N        Mean          Std Dev      Minimum      Maximum       Median
    age                      918651   32.6244700    15.2162848   -4.8240931   311.2470910   34.5078713
    Weight     weight (kg)   796412   51.9878071    19.1298253   0            120.0000000   55.8000000
    cd4                      152475   375.0323837   1672.22      0            536580.00     301.000000
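The minimum and maximum columns above include values that cannot be real (a negative age, an age above 300, a CD4 count above 500,000), which is the kind of problem the cleaning step has to catch. One possible way to count such records is sketched below. The cutoffs (ages outside 0 to 110, CD4 above 5000) and the names range_check, bad_age and bad_cd4 are illustrative assumptions, not values from the presentation, and the h libref is assumed to be assigned as in the PROC MEANS call.

```sas
/* Flag out-of-range values; cutoffs are illustrative assumptions only. */
data range_check ;
  set h.hivvisit2(keep=age cd4) ;
  /* SAS sorts missing values below any number, so exclude them explicitly */
  bad_age = (not missing(age) and (age < 0 or age > 110)) ;
  bad_cd4 = (not missing(cd4) and cd4 > 5000) ;
run ;

/* Count how many visit records fall outside each range. */
proc freq data=range_check ;
  tables bad_age bad_cd4 ;
run ;
```

Flagging rather than deleting keeps the decision about each suspect value with the analyst, who can then correct it, set it to missing, or exclude the record for a given research question.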

19 Processing and cleaning of data for specific research question
– NVP Toxicity datasets

