Published by Ὀδυσσεύς Παπαγεωργίου. Modified over 5 years ago.
1
Basic Anthropometric data quality checks
2
Objectives
Outliers and flags. WHO and SMART flags. Standard deviation.
3
Outliers and flags
Outliers are values that fall outside an acceptable range. Values outside the plausible range are frequently due to poor measurement, an inaccurate date of birth, or data recording errors, so their proportion is an important indication of data quality. Flagged records can be checked and corrected, or censored.
The currently recommended flagging system for detecting implausible z-score values in the analysis of national surveys was defined in 2006, when the WHO Child Growth Standards were released, replacing the WHO/NCHS growth references. The cut-offs were defined on the basis of what is biologically implausible, in other words incompatible with life. These cut-offs have been challenged on the basis of observations of living children whose z-scores lie beyond the currently defined implausible values.
4
Flagging
Flagging is usually applied to HAZ, WAZ, WHZ and BAZ (adults). Flagged records can be checked and corrected, or censored.
Flagging is a process of checking whether values of anthropometric indices are outside a given range. The process can easily be applied to other variables, even routine data not coming from surveys.
5
Flagging criteria
Two flagging methods are in common use: WHO flags and SMART flags. They have different flagging criteria. SMART flags are stricter, so they exclude more data, and SMART flagging criteria can therefore reduce the estimated prevalence.
Values outside the flagging limits are considered implausible, but note that such values may be observed in children admitted into therapeutic feeding programs. So be careful when using flags for routine data.
6
Flagging criteria
Both methods flag records in which one or more anthropometric indices are more than a certain distance either side of a reference value.
The WHO criteria are simple biologically plausible ranges around the reference mean of zero. If, for example, a value for WHZ is below −5 or above +5, then the record is flagged to indicate a likely problem with WHZ.
SMART criteria are more complicated. They require the mean value of the index to be calculated from the survey data. This mean is used as the reference value, and 3 z-scores are subtracted from it and added to it to create a range. For example, a mean WHZ of −1.15 gives lower and upper SMART flagging limits of −1.15 − 3 = −4.15 and −1.15 + 3 = +1.85.
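The two rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the WHZ values are hypothetical, the survey mean is taken over all values supplied, and a value lying exactly on a limit is not flagged.

```python
from statistics import mean

def who_flags(z_scores, lower=-5.0, upper=5.0):
    """Flag values outside a fixed range around the reference mean of zero.

    The defaults are the WHZ limits quoted on the slide (below -5 or above +5).
    """
    return [z < lower or z > upper for z in z_scores]

def smart_limits(survey_mean, width=3.0):
    """Return (lower, upper) limits: the survey mean minus and plus `width` z-scores."""
    return survey_mean - width, survey_mean + width

def smart_flags(z_scores, width=3.0):
    """Flag values more than `width` z-scores away from the survey mean."""
    lower, upper = smart_limits(mean(z_scores), width)
    return [z < lower or z > upper for z in z_scores]

# Hypothetical WHZ values, chosen so that their mean is -1.15 as in the slide example.
whz = [-5.6, -4.3, -2.1, -1.0, -0.4, 0.2, 1.5, 2.5]

low, high = smart_limits(-1.15)
print(f"SMART limits: {low:.2f} to {high:.2f}")
print("WHO flags:  ", who_flags(whz))
print("SMART flags:", smart_flags(whz))
```

With these made-up values the SMART rule flags every record the WHO rule flags plus some others, which is the pattern the slides describe as typical.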
7
Flagging criteria
The WHO and SMART flagging criteria will flag different but overlapping sets of measurements. The SMART flagging criteria will usually flag more records than the WHO flagging criteria, which will act to reduce the estimated prevalence.
8
Flagging criteria
Prevalence is in the "tails" of the distribution. In the plots, the estimated prevalence is shown for cases defined as z-scores below −3 relative to the reference median (i.e. zero). The red bars show the cases remaining after "outliers" to the left have been censored, so the area they cover represents the estimated prevalence after flagged values have been censored. The estimated prevalence is reported below each plot as p(z < −3).
Flagging is about detecting outlier values. Only one set of flagging criteria, either WHO or SMART, should be used at any one time. The WHO and SMART flagging criteria are designed to be applied to samples of children measured in surveys. They should not be applied to samples of severely malnourished or sick children. We recommend using WHO flags.
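The effect of censoring on p(z < −3) can be illustrated with a small Python sketch. The WHZ values are hypothetical, the WHO limits are the ±5 quoted for WHZ, and the SMART limits are the −4.15 and +1.85 from the worked example with a mean of −1.15; none of this is survey data.

```python
def prevalence_below(z_scores, cutoff=-3.0):
    """Proportion of values strictly below the case-defining cutoff."""
    return sum(z < cutoff for z in z_scores) / len(z_scores)

def censor(z_scores, lower, upper):
    """Keep only values inside [lower, upper]; flagged values are left out."""
    return [z for z in z_scores if lower <= z <= upper]

# Hypothetical WHZ values with a mean of -1.15.
whz = [-5.6, -4.3, -2.1, -1.0, -0.4, 0.2, 1.5, 2.5]

p_raw = prevalence_below(whz)
p_who = prevalence_below(censor(whz, -5.0, 5.0))
p_smart = prevalence_below(censor(whz, -4.15, 1.85))

print(f"p(z < -3), no censoring:    {p_raw:.3f}")
print(f"p(z < -3), WHO censoring:   {p_who:.3f}")
print(f"p(z < -3), SMART censoring: {p_smart:.3f}")
```

Because the censored values sit mostly in the left tail, where the cases are, each stricter rule lowers the estimate in this toy example.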
9
Flags
Calculate anthropometric indices from the anthropometric data and then apply the flagging criteria to the data.
Present the percentage of implausible values for each indicator separately (HAZ, WHZ and WAZ), as well as for each field team. WHO recommends a cut-off of 1% to define the percentage of implausible values that is indicative of poor data quality. SMART guidelines consider proportions above 7.5% to be problematic. The proportion of flagged records in a dataset should, ideally, be below 5%. If the percentage of implausible values is greater than 1%, present it by other disaggregations as well.
While a high percentage of flagged values is a good indication of poor data quality, a low percentage does not necessarily imply adequate data quality, as there can be inaccurate values within the WHO flag range.
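Reporting the percentage of flagged values by field team can be sketched as below. The record layout (the keys `team` and `whz_flag`) and the records themselves are hypothetical; the 1% threshold is the WHO cut-off quoted above.

```python
from collections import defaultdict

def percent_flagged_by_group(records, group_key, flag_key):
    """Return {group: percentage of records in that group whose flag is set}."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for rec in records:
        totals[rec[group_key]] += 1
        flagged[rec[group_key]] += 1 if rec[flag_key] else 0
    return {group: 100.0 * flagged[group] / totals[group] for group in totals}

# Hypothetical records: one row per child, with a WHZ flag already computed.
records = [
    {"team": 1, "whz_flag": False},
    {"team": 1, "whz_flag": False},
    {"team": 1, "whz_flag": False},
    {"team": 1, "whz_flag": False},
    {"team": 2, "whz_flag": True},
    {"team": 2, "whz_flag": False},
    {"team": 2, "whz_flag": False},
    {"team": 2, "whz_flag": False},
]

by_team = percent_flagged_by_group(records, "team", "whz_flag")
for team, pct in sorted(by_team.items()):
    note = "above the 1% cut-off" if pct > 1.0 else "within the 1% cut-off"
    print(f"Team {team}: {pct:.1f}% flagged ({note})")
```

The same function can be reused for other disaggregations by passing a different grouping key, and run once per indicator.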
10
Flags
Be careful with datasets to which flagging criteria have already been applied and from which flagged records have been removed. This is not good practice: all data should be shared. Flagged records should not be removed from the database, but they should be excluded (censored) from any analyses.
Flagging has a dual role:
1. It is a data-checking tool. If you have access to the data collection forms, you will often be able to check records and fix data-entry errors.
2. It is a measure of data quality. Flagged records can indicate problems with measurement, recording, data entry, and data checking.
11
Standard deviation
(Figure panels: SD = 1, SD < 1, SD > 1.)
The standard deviation (SD) is a statistical measure that quantifies the amount of variation in a dataset. The smaller the SD, the closer the data points tend to be to the mean. The higher the SD, the more spread out the data points are. SDs cannot be negative. The lowest possible value for an SD is zero, which would indicate that all data points are equal to the mean (i.e. there is only one value in the entire dataset, e.g. every child has exactly the same WHZ value, and thus zero variation).
The 2006 WHO growth standard reference sample has, by definition, a standard normal distribution with a mean of zero and an SD of 1 for each of the anthropometric indices, including WAZ, WHZ and HAZ. The growth standard is based on a sample of healthy children from six different countries (Brazil, Ghana, India, Norway, Oman, United States), with varying ethnic groups, living in an environment that did not constrain optimal growth.
But what is the SD in a malnourished population? We do not know, so setting limits is difficult.
12
Quick exercise
We want to compare the results of two exams taken by a class of 30 students. The marks of the first exam vary between 31% and 98%, and those of the second between 82% and 93%. Given these ranges, which standard deviation will be higher?
Low standard deviation: your data is "close" to the mean. High standard deviation: your data is dispersed over a large interval.
13
Quick exercise
The standard deviation will be higher for the results of the first exam.
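The answer can be checked numerically. The slide gives only the range of each exam, so the marks below are made up: 30 evenly spaced values across each range, which is an assumption for illustration and not the class's real results.

```python
from statistics import stdev

def evenly_spaced(lowest, highest, n):
    """Return n evenly spaced marks from lowest to highest (illustrative data only)."""
    step = (highest - lowest) / (n - 1)
    return [lowest + i * step for i in range(n)]

# Hypothetical marks for the 30 students, spanning the ranges given in the exercise.
exam1 = evenly_spaced(31, 98, 30)
exam2 = evenly_spaced(82, 93, 30)

print(f"SD of exam 1 marks: {stdev(exam1):.1f}")
print(f"SD of exam 2 marks: {stdev(exam2):.1f}")
```

A wider range does not guarantee a larger SD for every possible set of marks, but with the spread so different between the two exams it is the expected result.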
14
Standard deviation (SD)
The higher the SD, the more likely it is that data quality is poor. It is very difficult to set acceptable ranges. SDs are typically wider for HAZ. SDs for HAZ are largest for younger children. There should be no substantial difference between girls and boys.
The standard deviation is sometimes considered to be a useful measure of data quality when applied to z-scores. The 1995 WHO Technical Report on Anthropometry suggested a set of SD ranges outside of which data quality could be a concern, but these cut-offs need to be revised so that they better reflect nationally representative surveys and the WHO growth standard that is currently used.
SDs are typically wider for HAZ than they are for WAZ or WHZ. A portion of this is likely due to measurement error, as height is more difficult to measure accurately than weight with currently available equipment, and obtaining an accurate date of birth can also be an issue in some populations.
SDs for HAZ are largest for younger children and become tighter as the age of the children increases. A component of the larger SD is due to measurement error, as length is more difficult to measure than height.
SDs should not differ substantially between girls and boys, although there may be slight biological variation.
15
Standard deviation
(Figure panels: SD = 1, SD > 1.)
If the SD is greater than 1, the prevalence calculated with the observed SD is higher than the prevalence calculated with SD = 1.
The 1995 WHO Technical Report on Anthropometry recommended using the SD as a standard of quality, with acceptable ranges of 1.1 to 1.3 for HAZ, 1.0 to 1.2 for WAZ and 0.85 to 1.1 for WHZ. SMART states that the acceptable range for the standard deviation of the weight-for-height z-scores (WHZ) is 0.8 to 1.2. But there is no consensus.
More important than being outside a certain range is to identify and understand the causes of large SDs. A large SD can be due to poor data and produce inflated prevalence estimates, but it may also be due to sampling from a mixed population rather than to poor data quality.
Random measurement errors tend to increase the SD but do not affect the mean z-score. The mean z-score is therefore a more reliable indicator than the prevalence of malnutrition when such errors are frequent. However, a mean z-score is more difficult to communicate than a prevalence.
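The link between SD and prevalence can be sketched by assuming the z-scores are normally distributed. The mean of −1.15 is borrowed from the SMART example earlier in the deck, the cut-off of −3 from the p(z < −3) plots, and the SD of 1.2 is an arbitrary illustrative value; the function name is hypothetical.

```python
from statistics import NormalDist

def expected_prevalence(mean_z, sd_z, cutoff=-3.0):
    """Proportion of a normal distribution lying below the cutoff."""
    return NormalDist(mu=mean_z, sigma=sd_z).cdf(cutoff)

mean_whz = -1.15  # illustrative mean, taken from the SMART flagging example

p_sd_1 = expected_prevalence(mean_whz, 1.0)
p_sd_wide = expected_prevalence(mean_whz, 1.2)

print(f"p(z < -3) with SD = 1.0: {100 * p_sd_1:.1f}%")
print(f"p(z < -3) with SD = 1.2: {100 * p_sd_wide:.1f}%")
```

The increase holds whenever the cut-off lies below the mean, which is the usual situation for prevalence defined in the lower tail.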
16
Standard deviation
The SD is calculated by most software packages. Apply it only to cleaned data from which erroneous data and flagged records have been censored. In the SD formula, n is the number of data points, Ȳ is the mean of the Yi, and Yi is each of the values in the dataset.
It is recommended to present the SD for each indicator separately (HAZ, WHZ and WAZ), as well as for different strata. Explanations for unexpected SD values should be explored and included in the survey report. The SD for anthropometric indices in any given survey can also be compared to those from other surveys meant to be representative of the same population in and around the same time period.
Flagging and identifying the causes of large SDs is important for data quality assessment: if the SD is artificially inflated as a result of poor-quality data, the prevalence estimates are likely to be overestimated. Definitively quantifying how much of the dispersion in z-scores can be attributed to heterogeneity in relation to environments that do not support optimal growth, and how much to measurement error, is a challenging research question.
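The formula that the symbols n, Ȳ and Yi belong to is not preserved in this transcript. The usual sample standard deviation is given below as a reminder; the n − 1 denominator is an assumption, since some packages divide by n instead.

```latex
SD = \sqrt{\frac{\sum_{i=1}^{n} \left( Y_i - \bar{Y} \right)^2}{n - 1}}
```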
17
Conclusions
For the SD, further investigations are needed to (i) develop guidance on how to tease out the relative contribution of measurement error from the expected population-associated spread for any given survey, and (ii) ascertain a cut-off at which the SD might be more conclusively related to data quality for each anthropometric index.
Other approaches still need more testing.
Anthropometric data quality of our surveys, January 2019, Addis Ababa.
18
Exercise 4
Divide into 4 groups.
The file ex04.csv is a comma-separated-value (CSV) file containing anthropometric data from a recent SMART survey in Sudan. Calculate WHO and SMART flags.
Team B: present on WHO flags.
Team C: present on SMART flags.