New Guidelines for Data Result Suppression of Small Cell Numbers in IBIS-Query Output Public Health Informatics Brown Bag July 22, 2009 Kathryn Marti,

Slides:



Advertisements
Similar presentations
STANDARD ERRORS PRESENTATION AND DISEMINATION AT THE STATISTICAL OFFICE OF THE REPUBLIC OF SLOVENIA Rudi Seljak Statistical Office of the Republic of Slovenia.
Advertisements

Healthy People 2020 Update Utah Department of Health Office of Public Health Informatics Brown Bag Presentation March 25, 2009 Kathryn Marti, Director.
Chapter 10: Hypothesis Testing
1 Case Study 1: How to Deal with Estimates with Low Reliability 2009 Population Association of America ACS Workshop April 29, 2009.
EXCEL’ing at : June 19, 2003 Ruth Sanderson, Epidemiologist.
Statistical Concepts (part III) Concepts to cover or review today: –Population parameter –Sample statistics –Mean –Standard deviation –Coefficient of variation.
Topics: Inferential Statistics
CHAPTER 2 Basic Descriptive Statistics: Percentages, Ratios and rates, Tables, Charts and Graphs.
Lecture Slides Elementary Statistics Twelfth Edition
7-2 Estimating a Population Proportion
Statistics 800: Quantitative Business Analysis for Decision Making Measures of Locations and Variability.
BPS - 3rd Ed. Chapter 131 Confidence intervals: the basics.
Basic Statistical Concepts Donald E. Mercante, Ph.D. Biostatistics School of Public Health L S U - H S C.
1 Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Section 7.2 Estimating a Population Proportion Objective Find the confidence.
HaDPop Measuring Disease and Exposure in Populations (MD) &
Understanding study designs through examples Manish Chaudhary MPH (BPKIHS)
Inferential Statistics
Mapping Rates and Proportions. Incidence rates Mortality rates Birth rates Prevalence Proportions Percentages.
Are exposures associated with disease?
Chapter 7 Confidence Intervals and Sample Sizes
Medical Statistics (full English class) Ji-Qian Fang School of Public Health Sun Yat-Sen University.
1 CHAPTER 7 Homework:5,7,9,11,17,22,23,25,29,33,37,41,45,51, 59,65,77,79 : The U.S. Bureau of Census publishes annual price figures for new mobile homes.
Deanna E. White, Adam Stevens, John Barbaro, Kristy McGill and Lynne Russell.
Wisconsin Department of Health Services HIV/AIDS Surveillance Annual Review New diagnoses, prevalent cases, and deaths through December 31, 2013 April.
Measuring disease and death frequency
1/26/09 1 Community Health Assessment in Small Populations: Tools for Working With “Small Numbers” Region 2 Quarterly Meeting January 26, 2009.
Statistics: Concepts and Controversies What Is a Confidence Interval?
ESTIMATION. STATISTICAL INFERENCE It is the procedure where inference about a population is made on the basis of the results obtained from a sample drawn.
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.
Confidentiality Issues with “Small Cell” Data Michael C. Samuel, DrPH STD Control Branch California Department of Public Health 2008 National STD Prevention.
Chapter 6: Random Errors in Chemical Analysis CHE 321: Quantitative Chemical Analysis Dr. Jerome Williams, Ph.D. Saint Leo University.
1 Things That May Affect Estimates from the American Community Survey.
Utah Department of Health 1 1 Identifying Peer Areas for Community Health Collaboration and Data Smoothing Brian Paoli Utah Department of Health 6/6/2007.
IBIS-Admin New Mexico’s Web-based, Public Health Indicator, Content Management System.
Using ACS and Census 2010 in Communities and Neighborhoods: Guidelines and Tools POPULATION REFERENCE BUREAU | PRESENTATION BY MARK MATHER.
Some ACS Data Issues and Statistical Significance (MOEs) Table Release Rules Statistical Filtering & Collapsing Disclosure Review Board Statistical Significance.
Categorical data 1 Single proportion and comparison of 2 proportions دکتر سید ابراهیم جباری فر( (Dr. jabarifar تاریخ : 1388 / 2010 دانشیار دانشگاه علوم.
IBISAdmin Utah’s Web-based Public Health Indicator Content Management System.
Things that May Affect the Estimates from the American Community Survey Updated February 2013.
Is for Epi Epidemiology basics for non-epidemiologists.
Rates, Ratios and Proportions and Measures of Disease Frequency
Crude Rates and Standardisation Standardisation: used widely when making comparisons of rates between population groups and over time (ie. Number of health.
Today - Messages Additional shared lab hours in A-269 –M, W, F 2:30-4:25 –T, Th 4:00-5:15 First priority is for PH5452. No TA or instructor Handouts –
BPS - 3rd Ed. Chapter 131 Confidence Intervals: The Basics.
Statistics for clinicians Biostatistics course by Kevin E. Kip, Ph.D., FAHA Professor and Executive Director, Research Center University of South Florida,
Introduction to Inference: Confidence Intervals and Hypothesis Testing Presentation 8 First Part.
Introduction to Inference: Confidence Intervals and Hypothesis Testing Presentation 4 First Part.
Measures of Disease Frequency
1 Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example: In a recent poll, 70% of 1501 randomly selected adults said they believed.
Organization of statistical research. The role of Biostatisticians Biostatisticians play essential roles in designing studies, analyzing data and.
10 May Understanding diagnostic tests Evan Sergeant AusVet Animal Health Services.
Lecture 4 Ways to get data into SAS Some practice programming
Statistics Canada Citizenship and Immigration Canada Methodological issues.
BIOSTATISTICS Lecture 2. The role of Biostatisticians Biostatisticians play essential roles in designing studies, analyzing data and creating methods.
366_8. Estimation: Chapter 8 Suppose we observe something in a random sample how confident are we in saying our observation is an accurate reflection.
CALCULATE THE GROWTH RATE: Birth Rate = 10 Individuals Immigration = 20 Individuals Death Rate = 15 Individuals Emigration = 5 Individuals Growth Rate.
1 Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example: In a recent poll, 70% of 1501 randomly selected adults said they believed.
Asthma in Utah County Presented by: Celeste Beck, MPH Utah Asthma Program Epidemiologist.
Confidence Intervals and Hypothesis Testing Mark Dancox Public Health Intelligence Course – Day 3.
Chapter 2. **The frequency distribution is a table which displays how many people fall into each category of a variable such as age, income level, or.
Comparing Two Proportions Chapter 21. In a two-sample problem, we want to compare two populations or the responses to two treatments based on two independent.
Measures of disease frequency Simon Thornley. Measures of Effect and Disease Frequency Aims – To define and describe the uses of common epidemiological.
Sample size calculation
Confidence Interval for the Difference of Two Proportions
Measures of Disease Occurrence
Measures of risk and association
Basic measurements in Demography
Interpreting Epidemiologic Results.
Lecture Slides Elementary Statistics Twelfth Edition
Presentation transcript:

New Guidelines for Data Result Suppression of Small Cell Numbers in IBIS-Query Output Public Health Informatics Brown Bag July 22, 2009 Kathryn Marti, Director Office of Public Health Assessment

Outline Introduction Background Work Group Important Concepts Methodology The Guidelines Next steps

Introduction Why not report data? –Protect privacy –Ensure reliable data Why make data publicly available? What reports? –Published reports –Online static tables –Web-based data query system output

Background Current IBIS-Q suppression rules Enhancements to IBIS-Q BRFSS Increased likelihood of small cell sizes Grew to department-wide effort Develop UDOH-wide policy HIPAA and more Sensitivity for vulnerable populations

Data Suppression Decision Rules Work Group Charter Provide guidelines to UDOH programs Program into IBIS-Q Work Group Products: –Decision Rules –Decision Tree –Justification of the rules

Important Concepts Confidence Interval Coefficient of Variation Power to detect a difference Context/Use of the estimate

Coefficient of Variation CV=[100 x (SE(R)/R)]% Where: –SE = the standard error of the estimate –R = the estimated rate Unitless measure Usually expressed as a percentage

Coefficient of Variation (cont’) If the estimated percentage is greater than 50%, the CV is calculated by dividing the standard error of the estimate (SE(R)) by one minus the estimate (1-R). CV=[100 x (SE(R)/(1-R))]%

Coefficient of Variation for Percentages Example #1 The crude estimated percentage of Utah adults aged 85 and older who were current cigarette smokers in 2007 was 7.65% with a standard error of 3.81%. CV = 100 x (0.0381/0.0765) = 49.8%

Coefficient of Variation for Percentages (cont’) Example #2 The crude estimated percentage of Utah adults aged 85 and older who were non- smokers in 2007 was 92.35% with a standard error of 3.81%. CV = 100 x (0.0381/( )) = 100 x (0.0381/0.0765) = 49.8%

Coefficient of Variation for Count Data D = # of deaths ~ Poisson P = # in population where deaths occurred R = rate per 100,000 = (D/P) *100,000 Var(R) = Var [(D/P) x 100,000] = (100,000/P) 2 Var(D) = (100,000/P) x ((100,000*D)/P) = (100,000/P) x R CV= 100 x [SQRT(Var(R))]/R = 100 x [SQRT((100,000/P)*R)]/R = 100 x SQRT[100,000/(P*R)]

Coefficient of Variation for Count Data (cont’) Example #1 D = 388 Alzheimer’s disease deaths in Utah in 2006 P = UT Population in 2006 was 2,615,129 R = (388/2,615,129) x 100,000 = per 100,000 CV = 100 x SQRT[100,000/(P*R)] CV = 100 x SQRT[100,000/(2,615,129 x 14.84)] = 5.08%

Coefficient of Variation for Count Data (cont’) Example #2 D = 5 Alzheimer’s disease deaths in Utah adults between the ages in 2006 P = UT Population in 2006 was 201,340 R = (5/201,340) x 100,000 = 2.48 per 100,000 CV = 100 x SQRT[100,000/(P*R)] CV = 100 x SQRT[100,000/(201,340 x 2.48)] = 44.75%

Methodology Literature review Compared suppression rules using BRFSS small area, low prevalence data. –< 5 observations in numerator or 30 in denominator –CV greater than 30% –CV greater than 50% –the CI length larger than the estimated percent –1/2 the CI length >10% and denominator less than 50

Guidelines Minimum Criteria For Reporting Both Survey Data and Population Event Data:  CV < 50%  If 30% < CV < 50% an asterisk should be included with a footnote that says: *Use caution in interpreting, the estimate has a relative standard error greater than 30% and does not meet UDOH standards for reliability.

Guidelines Strict Criteria For Reporting Survey Data:  > 10 cases in the numerator  AND a CV < 30% For Reporting Population Event Data:  > 20 cases in the numerator and >100 persons in the population AND a CV < 30%

TYPE OF DATA KIND OF POPULATION SURVEY DATA COUNT DATA GENERAL (MINIMUM CRITERIA) VULNERABLE (STRICT CRITERIA) NUMERATOR RATE <10 SUPPRESS OR AGGREGATE ≥10 RATE >50% <50% CV=SE/RATE CV=SE/(1-RATE) >50% <50% CV=SE/RATE CV=SE/(1-RATE) REPORT REPORT WITH WARNING SUPPRESS OR AGGREGATE REPORT REPORT WITH WARNING SUPPRESS OR AGGREGATE REPORT SUPPRESS OR AGGREGATE SUPPRESS OR AGGREGATE 30<CV<50% >50% 30<CV<50% <30% >30% <30% >30% <30% >50%

TYPE OF DATA KIND OF POPULATION COUNT DATA SURVEY DATA GENERAL (MINIMUM CRITERIA) VULNERABLE (STRICT CRITERIA) CELL SIZE <20/100 SUPPRESS OR AGGREGATE REPORT REPORT WITH WARNING SUPPRESS OR AGGREGATE 30<CV<50% <30% >50% CV= 100 x 100,000. PR √( ) REPORT SUPPRESS OR AGGREGATE <30% >30% CV=100 x 100,000. PR √( ) ≥20/100

Next Steps Program into IBIS-Q and SAS CGI Determine how to display output and notes –New Mexico example –Utah Test server Get OK from Data Stewards –Minimum criteria vs. Strict criteria

Next Steps (con’t) Suggested text for 30% < CV < 50% : *Use caution in interpreting, the estimate has a relative standard error greater than 30% and does not meet UDOH standards for reliability.