HIS Topic 3: Data access and sharing


1 HIS Topic 3: Data access and sharing
2017 PEPFAR Data and Systems Applied Learning Summit, September 19, 2017

2 Welcome & Introductions

3 Agenda
Estimated Time: Topic
5 minutes: 1. Welcome
10 minutes: 2. PEPFAR/MOH data alignment; indicator mapping
10 minutes: 3. Standardized MER Analytic Datasets
4. WHO dashboards
5. Facilitated country team discussion
30 minutes: 6. Open discussion and conclusion

4 Standardized MER Analytic Datasets (Fact View)

5 Background To meet reporting and performance expectations, there is a need to access large (>100MB) datasets. Ideally, this would occur within a data architecture that allows seamless access, analysis, and distribution of data. Several issues made it difficult for ICPI or agencies to perform analysis without extracting data out of the existing data systems.

6 Primary Options
Panorama
Description: Web-based platform for quarterly data reviews
Benefits: Standardized views; calculated indicators available; GREAT first place to look at MER data
Potential limitations: Limited ability to do customized analyses; available 1 week after period closes
DATIM: Pivot table, visualizer
Description: Tool for exporting detailed, site-level data sets for further manipulation
Benefits: Access to pre-approved data (starting Q4); data manipulation possible; can create favorites and share data views
Potential limitations: Requires knowledge of how to combine & manipulate data; calculated indicators not included
ICPI Fact View Datasets
Description: Data text files that can be imported into statistical software or Excel
Benefits: Fully customizable analyses made possible; calculated indicators & features to improve analytics; data in a standardized format that is easy to use in Excel pivot tables
Potential limitations: Requires knowledge of how to combine & manipulate data; available 1 week after period closes

7 Before Fact View Standardized Datasets
DATIM Pivot tables
Governed by user permissions.
Extracting data can be challenging – Aw Snap!
Requires knowledge of how to combine & manipulate data.
Calculated indicators not included.
Data structure makes it challenging to develop routine analyses.

8 Source: PEPFAR Data Hub

9 Purpose of Fact View Datasets
Provide standardized datasets for analytics
Avoid stakeholders looking at different data
Reduce manual labor required to prep data
Structured to facilitate analysis in Excel and statistical packages (R, SAS, STATA)
Always delivered in the same format for ease of refreshing visualizations & analyses
One primary purpose of producing these datasets is to reduce the amount of manual labor required to prepare data for use in agency and interagency analyses. Rather than trying to go in and pull a bunch of different things out of Genie, you can just download one dataset that has what you need in it. So we’ve structured the data to facilitate analyses. If you have questions about how best to use the data given the structure, feel free to reach out to any member of the DAQ and we’re happy to help. Our Fact View Datasets are also always delivered in the same format, so if you have recurring visualizations or analyses to create, it’s easy to drop a new dataset into your tool and refresh your findings after each new frozen instance of the database.

10 Fact View Datasets
5 downloadable, pre-structured datasets containing MER data:
Implementing Mechanism (IM)
Priority Sub National Units (PSNU)
PSNU x IM (one per OU)
Site x IM (one per OU)
NAT & SUBNAT
The IM dataset and the PSNU dataset are global datasets, so they contain data from all countries. We do produce a global PSNU x IM dataset that is primarily for use by ICPI Analysts to create some of the Excel tools for specific program areas. When we get down to the PSNU x IM and Site x IM levels, we produce individual datasets for each OU.

11 When are Fact View Datasets Released?
Twice per quarter (aligned with Panorama refreshes):
After initial data entry for a quarter closes
After data cleaning/deduplication closes
FV datasets released ~1 week after data entry/dedup closes
We plan to release Fact View Datasets twice per quarter, in line with the PEPFAR reporting calendar: once after the initial data entry for a quarter closes and again after the cleaning/deduplication period closes. In reality, OGAC sometimes reopens data submission a 3rd time in a quarter to fix major issues or allow additional cleaning of one or more OUs’ data. If that occurs and Panorama is also refreshed, new Fact View Datasets will likely be released too. It takes us about a week to pull the data and check for consistency across the Fact View Datasets, Panorama, and final.datim. Panorama and the Fact View Datasets are therefore generally both released about a week (or a little less) after the data entry or cleaning/deduplication period closes, so keep that in mind when you’re scheduling meetings to talk about new data.

12 What is in the Fact View Datasets?
Where? orgUnitUID, Region, RegionUID, OperatingUnit, OperatingUnitUID, CountryName, SNU1, SNU1uid, PSNU, PSNUuid, FY16SNUPrioritization, FY17SNUPrioritization, typeMilitary, CommunityUID, Community, FY16CommunityPrioritization, FY17CommunityPrioritization, TypeCommunity, FacilityUID, Facility, FY16FacilityPrioritization, FY17FacilityPrioritization, TypeFacility
Who? MechanismUID, PrimePartner, FundingAgency, MechanismID, ImplementingMechanismName
When? FY2015Q2, FY2015Q3, FY2015Q4, FY2015APR, FY2016_TARGETS, FY2016Q1, FY2016Q2, FY2016Q3, FY2016Q4, FY2016APR, FY2017_TARGETS, FY2017Q1, FY2017Q2
What? dataElementUID, Indicator, numeratorDenom, indicatorType, disaggregate, standardizedDisaggregate, categoryOptionComboUID, categoryOptionComboName, Age, Sex, resultStatus, otherDisaggregate, coarseDisaggregate, modality, tieredSiteCounts, typeTieredSupport, isMCAD
We produce datasets at different levels: an IM level, a PSNU level, a PSNU x IM level, and a Site x IM level (plus the NAT & SUBNAT dataset). Each type of dataset contains different information. We’ve grouped each variable according to which dataset(s) contain that variable.
Green – All datasets
Blue – PSNU, PSNU x IM, and Site x IM (NOT IM dataset)
Gray – IM, PSNU x IM, and Site x IM (NOT PSNU dataset)
Pink – Site x IM dataset only

13 Fact View Datasets Contain
A focused organizational hierarchy: contains Region, OU, Country, PSNU, Implementing Mechanism, Community, and Facility so you can aggregate/analyze at multiple levels
Prioritization values for FY16 & FY17 (e.g., sustained, scale-up aggressive)
FY15, FY16 & FY17 results (including de-dups)
Pre-calculated APR totals based on MER guidance
FY16 (COP15) & FY17 (COP16) targets
Indicator type (TA/DSD)
Tiered support (# of site visits)*
Disaggregates: Age, Sex, Modality, Result Status, Other Disaggs
Standardized Disaggregate (helps with HTS analysis)
Most Complete Age Disaggregate (MCAD) values for HTS_TST, TX_CURR and TX_NEW
Calculated Indicators
*Available only in Site x IM datasets

14 Need for calculated indicators

15 Calculated Indicators
Calculated indicators are created to make analysis of complicated PEPFAR indicators simpler. Each one automatically aggregates data by a specific grouping (e.g., positive or negative) within a disaggregate. Each calculated indicator is listed under the “Indicator” column as if it were a regular MER indicator.
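As a rough illustration of what a calculated indicator does, the Python sketch below sums the rows of one indicator whose resultStatus falls into a chosen group and returns a new row under a derived indicator name. The column names follow the variable list in this deck; the derived name HTS_TST_POS_EXAMPLE, the sample values, and the function itself are hypothetical and are not ICPI's actual method.

```python
def calculated_indicator(rows, indicator, result_status, period, new_name):
    """Sum one period column over rows matching an indicator and result status.

    Hypothetical helper: it only illustrates the idea of aggregating a
    grouping within a disaggregate into a single "calculated" row.
    """
    total = sum(
        float(r[period] or 0)  # treat blank cells as zero
        for r in rows
        if r["Indicator"] == indicator and r["resultStatus"] == result_status
    )
    return {"Indicator": new_name, period: total}


# Made-up sample rows, using column names from the variable list.
rows = [
    {"Indicator": "HTS_TST", "resultStatus": "Positive", "Age": "15+", "FY2017Q2": "12"},
    {"Indicator": "HTS_TST", "resultStatus": "Positive", "Age": "<15", "FY2017Q2": "3"},
    {"Indicator": "HTS_TST", "resultStatus": "Negative", "Age": "15+", "FY2017Q2": "200"},
    {"Indicator": "TX_NEW", "resultStatus": "", "Age": "15+", "FY2017Q2": "9"},
]

hts_pos = calculated_indicator(
    rows, "HTS_TST", "Positive", "FY2017Q2", "HTS_TST_POS_EXAMPLE"
)
```

Because the result is shaped like any other row, it can sit in the Indicator column next to regular MER indicators, which is the behavior the slide describes.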

16 APR Calculated Values
APR totals calculated for all indicators according to MER guidance
Summed Annual APR Totals (e.g., HTS_TST): APR = Q1 + Q2 + Q3 + Q4
Snapshot Annual APR Totals (e.g., TX_CURR): APR = Q4
Another great feature of the Fact View Datasets is that we have calculated APR year-end totals for every indicator according to the 2017 MER Guidance. Where an indicator’s APR is calculated by adding together the data from each quarter, we have done that. Where the APR value is a snapshot at the end of the year, as for Treatment Current, we have set APR = Q4 value.
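The two APR rules can be sketched in a few lines of Python. This is a minimal illustration, not the ICPI implementation: the set of snapshot indicators below contains only TX_CURR, the one example named on the slide, and the full list would have to come from the MER guidance.

```python
# Assumption: only the snapshot example named on the slide is listed here.
SNAPSHOT_INDICATORS = {"TX_CURR"}


def apr_total(indicator, q1, q2, q3, q4):
    """Return the APR value for one indicator from its four quarterly values."""
    if indicator in SNAPSHOT_INDICATORS:
        # Snapshot indicators report the end-of-year value: APR = Q4.
        return q4
    # Summed indicators add up the quarters: APR = Q1 + Q2 + Q3 + Q4.
    return q1 + q2 + q3 + q4


hts_apr = apr_total("HTS_TST", 10, 20, 30, 40)
tx_apr = apr_total("TX_CURR", 100, 110, 120, 130)
```

With these made-up quarterly values, the summed indicator adds all four quarters while the snapshot indicator simply carries forward its Q4 value.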

17 MCAD – calculated disaggregate
Selection rules (N = Total Numerator, F = Fine, C = Coarse):
1. When there is a total numerator, MCAD picks either Fine OR Coarse depending on which is closer to the Total Numerator value, and privileges Fine if Fine = Coarse. If abs(N – F) <= abs(N – C), then F; if abs(N – F) > abs(N – C), then C.
2. If there is no total numerator, MCAD picks either Fine OR Coarse depending on which is the higher value, and privileges Fine if Fine = Coarse. If abs(F) >= abs(C), then F; if abs(F) < abs(C), then C.
In the case of dedups, the absolute value is taken for comparison: abs(Fine) vs. abs(Coarse), compared against abs(Total Numerator).
MCAD is a calculated disaggregate used for HTS_TST, TX_NEW, and TX_CURR. These are the indicators with both Fine & Coarse age-sex disaggregates.
MCAD selects the most complete disaggregate (either Fine or Coarse) that was entered at each Site-IM DSD/TA level & combines them into a single, new calculated disaggregate: <15/15+ and Male/Female/Unknown Sex.
Disaggregate column: contains “MostCompleteAgeDisagg” (e.g., VCT/MostCompleteAgeDisagg).
For further details, read the User’s Guide and Data Dictionary.
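The selection rules can be expressed as a small Python function. This is a sketch of one reading of the slide: absolute values are applied to all three inputs (which changes nothing for ordinary non-negative values and covers the dedup case), and the function name and return labels are illustrative only.

```python
def mcad_pick(fine, coarse, total_numerator=None):
    """Return "Fine" or "Coarse" following the MCAD rules on the slide.

    fine / coarse: totals of the Fine and Coarse age-sex disaggregates.
    total_numerator: the Total Numerator, or None if none was entered.
    """
    f, c = abs(fine), abs(coarse)  # dedup values are negative, so compare magnitudes
    if total_numerator is not None:
        n = abs(total_numerator)
        # Rule 1: pick whichever is closer to the total numerator; ties go to Fine.
        return "Fine" if abs(n - f) <= abs(n - c) else "Coarse"
    # Rule 2: no total numerator, so pick the larger value; ties go to Fine.
    return "Fine" if f >= c else "Coarse"


closer_to_total = mcad_pick(fine=90, coarse=100, total_numerator=100)
tie_with_total = mcad_pick(fine=95, coarse=95, total_numerator=100)
no_total = mcad_pick(fine=40, coarse=50)
dedup_case = mcad_pick(fine=-8, coarse=-5, total_numerator=-10)
```

In the real datasets this choice is made separately for each Site-IM DSD/TA combination before the selected values are combined into the MCAD disaggregate; the sketch covers only the per-combination decision.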

18 Standardized Disaggregate
This table can be found in the User’s Guide and Data Dictionary. Of note, data from PMTCT ANC and VMMC Age/Result disaggregates were distributed across the Standardized Disaggregates according to the appropriate age bands.

19 SO MUCH DATA!
Some datasets are too big to open in Excel
PSNU (global), some PSNU x IM, and some Site x IM datasets exceed Excel’s limit for # of rows
For the largest files, statistical software (R/SAS/STATA) is needed to analyze data or reduce file size
DAQ posted a guide containing R/SAS/STATA code to allow users to filter by OU or indicator in order to reduce the size of datasets (so the data becomes usable in Excel)
ICPI has posted a Word document called “Code for Manipulating ICPI Fact View Datasets in Statistical Packages” on pepfar.net, which provides instructions on how to import Fact View Datasets and trim their file size in various statistical packages (R, SAS, and STATA). Once the size of a file is reduced (by eliminating extraneous OUs or indicators), the dataset can be exported and opened in Excel. If you do not have access to stats software in country, contact your SI Advisor or your agency SI POCs for assistance in reducing file size. If you provide information about which indicators, disaggs, etc. you want, someone at HQ should be able to help get you a file to use in Excel. ICPI has also posted separate PSNU files for each OU to facilitate their use by country teams. These OU-specific PSNU files are uploaded on both Panorama and PEPFAR.NET and can be opened directly in Excel.
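The posted guide covers R, SAS, and STATA; as a rough equivalent, here is a Python sketch of the same trimming step using only the standard library. It streams the file row by row, so the full dataset never has to fit in memory. The tab delimiter is an assumption about the text-file format, the function name is hypothetical, and the sample rows are made up for illustration.

```python
import csv
import io


def filter_fact_view(src, dst, operating_units=None, indicators=None, delimiter="\t"):
    """Copy rows from src to dst, keeping only the requested OUs and indicators.

    src / dst are open text file objects. A filter left as None is not applied.
    Returns the number of data rows written.
    """
    reader = csv.DictReader(src, delimiter=delimiter)
    writer = csv.DictWriter(
        dst, fieldnames=reader.fieldnames, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    kept = 0
    for row in reader:
        if operating_units and row["OperatingUnit"] not in operating_units:
            continue
        if indicators and row["Indicator"] not in indicators:
            continue
        writer.writerow(row)
        kept += 1
    return kept


# Tiny in-memory stand-in for a Fact View text file (made-up values).
sample = io.StringIO(
    "OperatingUnit\tIndicator\tFY2017Q2\n"
    "Kenya\tHTS_TST\t120\n"
    "Kenya\tTX_NEW\t30\n"
    "Zambia\tHTS_TST\t75\n"
)
out = io.StringIO()
kept = filter_fact_view(sample, out, operating_units={"Kenya"}, indicators={"HTS_TST"})
```

With a real download, the two StringIO objects would be replaced by files opened with open(path, newline=""), and the trimmed output could then be opened in Excel once it falls under the row limit.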

20 Summary of supplemental materials
User’s Guide & Data Dictionary
How to access & import into Excel properly
Known nuances (general reasons why Fact Views & Panorama don’t always match DATIM exactly)
Column-by-column explanation of data
Minimal changes made each release
Release Notes
New passwords for datasets are created each release
Written summary of validation/consistency checks for current release (actual differences between Fact View, Panorama & FINAL.DATIM)
Consistency Check “Cheat Sheet”
DAQ posted a guide containing R/SAS/STATA code to allow users to filter by OU or indicator in order to reduce the size of datasets (so the data becomes usable in Excel)
ICPI Fact View Analytic Datasets, supporting documentation, and trainings can be downloaded on PEPFAR.net or Panorama

21 Site x IM Dataset Particularities
The Site x IM dataset contains the most granular data of all the ICPI Fact View datasets.
Due to the level of detail, security measures have been implemented to protect sensitive data.
It is essential to understand these measures and the limitations of the dataset BEFORE proceeding with analysis.

22 Finally… Where are the datasets???

23 Accessing via PEPFAR.net
Navigate to the ICPI Fact View Datasets: Home > HQ > Interagency Collaborative for Program Improvement (ICPI) > Shared Documents > ICPI Data Store > MER > “ICPI Fact View – August ” (or most recent date)
Using the dropdown arrow next to the dataset you’re interested in (e.g., “ICPI_Fact_View_PSNU_IM_ _v1_1_[OU name]”), select “Download a copy”.
Download & open the “ICPI_Fact_View_Release_Notes_ ”. This document contains a password for the zip file.

24 Accessing via Panorama
Log in to Panorama, navigate to the home page, and click “Download Files/Links” in the bottom left of the page.

25 ICPI Tools using Fact View Datasets
Datapack TX Treatment Dashboard TX Net New Site Level DREAMS Dashboard Gender HTS Tool HTS & Linkages – CHIPS Tool HTS & Linkages – HITS Tool KP Dashboard OVC Dashboard PPR VMMC Quarterly Tool Data Review Tool (DRT)

26 Reminder About GENIE Exports
DATIM team developing a new GENIE export structured like Site by IM Fact View Datasets
Anticipated Q4 rollout
Will only contain results from current period (no targets or extra periods)
Will include calculated indicators, MCAD & standardized disaggs
~24hr delay between when data is entered & when it is available
Special Site x IM considerations (re: military & KP data) will not apply to Genie exports; data will appear as reflected in DATIM
The DATIM team is currently developing a new GENIE export option that will allow you to export datasets containing unapproved data. These new exports will have the exact same structure as our Site x IM Fact View Datasets. The team is working to fix the current size limitations on Genie pulls so that, hopefully, you can export an entire dataset in one go (rather than large countries having to do 40 different Genie exports and then stitch them together). The key difference is that, unlike our Fact View Datasets, the Genie exports will only have data from the current reporting period (so no targets or results from other periods yet). But the exports will include all of our calculated indicators, the MCAD, and the standardized disaggregate. We should point out that the exports won’t be 100% real time: there could be a 24-hour delay between when a partner enters data and when it is available in the Genie export. That’s because the data isn’t coming directly out of DATIM; it is being pulled from the PEPFAR Data Hub so that calculations and manipulations can be done to create the calculated indicators, MCAD, and standardized disaggregate.

27 Thank You!

