Data Processing & Data Quality
A virtuous cycle
The implicit assumptions underlying information systems are twofold: first, that good data, once available, will be transformed into useful information which, in turn, will influence decisions; second, that such information-based decisions will lead to a more effective and appropriate use of scarce resources through better procedures, programmes, and policies, the execution of which will lead to a new set of data which will then stimulate further decisions, and so forth in a spiral fashion.
(Sauerborn 2000, in Lippeveld et al., Design and Implementation of Health Information Systems)
Information Cycle (diagram)
Stages: What do we collect? How do we process it? How do we present it? How do we use it?
Diagram labels: Stages; Tools; Outputs; data sources & tools; Data quality checks & analysis; Timely; Quality data; Information; Reliable Information
Ensuring data accuracy
Once data has been collected, it should be checked for any inaccuracies and obvious errors. Ideally this should be done as close to the point of data collection as possible.
– Identify cause
– Prevent future errors
Remember Johan’s little investigation
Why is checking data vital?
Use of inaccurate data leads to
– Wrong priorities (focus on the wrong data)
– Wrong decisions (not applying the right actions)
– Garbage in = garbage out
Producing data is EXPENSIVE
– Collecting poor data wastes resources and time
Data, in order to be useful, should be:
– RELIABLE: correct, complete, consistent
– TIMELY: fixed deadlines for reporting
– AVAILABLE: who reports to whom? feedback mechanisms
– ACTIONABLE: no action = throw data away
– COMPARABLE: same numerator and denominator definitions used by all (e.g. geography vs org. unit function)
Complete data?
– Geography: submission by all (most) reporting facilities
– Time: can you do analysis over time? Consistency?
– Do your services cover the full population? Many indicators depend on population figures as denominators
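As a rough illustration of the geographic completeness check above, the sketch below computes the share of expected facilities that have reported and lists those that have not. The function names and the facility names are hypothetical, and this is plain Python, not DHIS2 functionality.

```python
def reporting_completeness(expected_facilities, received_reports):
    """Return the share (0.0 to 1.0) of expected facilities that submitted a report."""
    expected = set(expected_facilities)
    if not expected:
        raise ValueError("no facilities are expected to report")
    reported = expected & set(received_reports)
    return len(reported) / len(expected)


def missing_facilities(expected_facilities, received_reports):
    """Return the expected facilities that have not reported, sorted by name."""
    return sorted(set(expected_facilities) - set(received_reports))


# Hypothetical example: four facilities expected, three reported this month.
expected = ["Clinic A", "Clinic B", "Clinic C", "Hospital D"]
received = ["Clinic A", "Clinic C", "Hospital D"]

print(reporting_completeness(expected, received))  # 0.75
print(missing_facilities(expected, received))      # ['Clinic B']
```

The list of missing facilities is often more useful than the rate itself, because it says who to follow up with.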
Correct data?
– Are we even collecting the right data?
– Does the data seem sensible/plausible?
– Is the same definition applied uniformly?
– Is the handwriting legible?
– Are any preferential end digits used?
PREFERENTIAL END-DIGITS
(figure; month labels: JAN, FEB, MARCH, APRIL, MAY, JUNE, JULY)
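One possible way to look for preferential end digits is to count how often each final digit occurs in a series of reported values. In the sketch below, the function names, the 25% threshold, and the sample figures are all assumptions for illustration. With larger counts and no rounding, each digit would be expected to appear in roughly 10% of values, so a much higher share suggests figures were estimated rather than counted.

```python
from collections import Counter


def end_digit_shares(values):
    """Return the share of values ending in each digit 0-9."""
    counts = Counter(abs(int(v)) % 10 for v in values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no values to analyse")
    return {digit: counts.get(digit, 0) / total for digit in range(10)}


def preferred_digits(values, threshold=0.25):
    """Return end digits whose share exceeds the (assumed) threshold."""
    shares = end_digit_shares(values)
    return [digit for digit, share in shares.items() if share > threshold]


# Hypothetical reported values: every figure ends in 0 or 5.
reported = [120, 135, 140, 150, 155, 160, 170, 180, 185, 190]

print(preferred_digits(reported))  # [0, 5]
```

A flagged digit is a prompt to check the source registers, not proof of an error.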
Consistent data?
– Data in a similar range to this time last year, or similar to comparable reporting organization units
– No large gaps or missing data
– No multiplicity of data (same data from multiple sources – which one to trust?)
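The three consistency questions above can each be turned into a simple check. This is a minimal sketch in plain Python: the function names, the 50% tolerance, and the sample values are assumptions chosen for illustration, not recommended standards.

```python
def deviates_from_last_year(current, last_year, tolerance=0.5):
    """True if current differs from last year's value by more than tolerance (a fraction)."""
    if last_year == 0:
        return current != 0
    return abs(current - last_year) / last_year > tolerance


def find_gaps(values_by_period):
    """Return the periods for which no value was reported (value is None)."""
    return [period for period, value in values_by_period.items() if value is None]


def conflicting_sources(values_by_source):
    """True if different sources report different values for the same data element."""
    return len(set(values_by_source.values())) > 1


# Hypothetical monthly values for one facility.
this_year = {"Jan": 110, "Feb": None, "Mar": 420}
last_year = {"Jan": 100, "Feb": 105, "Mar": 130}

flagged = [
    month for month, value in this_year.items()
    if value is not None and deviates_from_last_year(value, last_year[month])
]
print(flagged)                                                 # ['Mar']
print(find_gaps(this_year))                                    # ['Feb']
print(conflicting_sources({"register": 42, "tally sheet": 45}))  # True
```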
Timely data?
– Some data needs to be acted upon immediately
– Late reports weaken the potential for comparison, and action may come too late, but they are still useful for documenting trends
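Timeliness can be summarised as the share of received reports that arrived on or before the reporting deadline. The sketch below assumes a hypothetical deadline and hypothetical submission dates; the function names are illustrative only.

```python
from datetime import date


def timeliness_rate(submission_dates, deadline):
    """Return the share of received reports submitted on or before the deadline."""
    if not submission_dates:
        raise ValueError("no reports received")
    on_time = sum(1 for submitted in submission_dates.values() if submitted <= deadline)
    return on_time / len(submission_dates)


def late_reporters(submission_dates, deadline):
    """Return the facilities whose report arrived after the deadline, sorted by name."""
    return sorted(name for name, submitted in submission_dates.items() if submitted > deadline)


# Hypothetical example: monthly reports due on the 7th of the following month.
deadline = date(2024, 2, 7)
submissions = {
    "Clinic A": date(2024, 2, 3),
    "Clinic B": date(2024, 2, 7),
    "Clinic C": date(2024, 2, 12),
    "Hospital D": date(2024, 2, 20),
}

print(timeliness_rate(submissions, deadline))  # 0.5
print(late_reporters(submissions, deadline))   # ['Clinic C', 'Hospital D']
```

Late reports are kept rather than discarded here, in line with the point that they remain useful for documenting trends.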
Accuracy enhancing principles
– Capacity building through training (90% of HISP activities)
– User-friendly collection/collation tools
– Feedback on data errors (but not only!)
– Feedback of analysed information
– Local use of information
Controlling quality with DHIS2
– Maximum / minimum values
– Validation rules
– Validation checks/reminders
– Completeness and timeliness reports (input for a league table?)
(Will be covered in lab session)
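To make the idea behind minimum/maximum values and validation rules concrete, here is a minimal sketch in plain Python. It does not use the DHIS2 API or reproduce how DHIS2 implements these checks. The data element names, the range limits, and the example rule (4th antenatal visits should not exceed 1st antenatal visits) are hypothetical.

```python
import operator


def outside_range(value, minimum, maximum):
    """True if the value falls outside the expected minimum/maximum range."""
    return value < minimum or value > maximum


def rule_holds(record, left, compare, right):
    """True if the rule 'left <compare> right' holds for the record."""
    return compare(record[left], record[right])


# Hypothetical monthly record for one facility.
record = {"anc_1st_visit": 80, "anc_4th_visit": 95}

# Range check: 80 lies within the assumed range 20 to 150.
print(outside_range(record["anc_1st_visit"], 20, 150))  # False

# Rule check: 95 <= 80 is false, so the rule is violated.
print(rule_holds(record, "anc_4th_visit", operator.le, "anc_1st_visit"))  # False
```

A violated rule does not always mean the data is wrong, so each flagged value should be verified against the source before it is changed.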
Good data quality: 10 steps to achieve it
1. Small, Essential Data Set (EDS)
2. Use of data locally by the collectors
3. Clear definitions and standards
4. Careful collection and collation of data, with good tools
5. Sharing of information
6. Regular feedback
7. Supportive supervision at all levels
8. Ongoing capacity building through training and support
9. Regular discussion of information at facility team meetings
10. Monitoring and rewarding good information (League Table)
or else…
A vicious cycle (diagram)
Diagram elements:
– Weak HIS
– Poor data quality
– Data not trusted
– Weak demand
– Decisions not evidence-based
– Using evidence not perceived as a winning strategy
– Limited investment in HIS
– Limited capacity to manage or analyse data
– Donors get their own
– Fragmentation