UNECE 2005: Software Tools for Statistical Disclosure Control by Complementary Cell Suppression – Reality Check. Ramesh A. Dandekar, U. S. Department.

Presentation transcript:

UNECE
Software Tools for Statistical Disclosure Control by Complementary Cell Suppression – Reality Check
Ramesh A. Dandekar
U. S. Department of Energy, Washington DC

Equality Constraints Associated With Suppression Pattern
A x = b
Used to create a solution space around the unknown (suppressed) table values
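
To make the A x = b idea concrete, here is a minimal sketch of how the equality constraints could be assembled from a published table. The 3x3 table, the suppression pattern, and the function name `build_constraints` are all hypothetical and chosen only for illustration; they are not taken from the presentation.

```python
# Hypothetical 3x3 table of internal cells; row and column totals are published.
table = [
    [20, 50, 10],
    [30, 15, 25],
    [40, 35, 45],
]
# Hypothetical suppression pattern: (row, col) positions withheld from publication.
suppressed = [(0, 0), (0, 1), (1, 0), (1, 1)]


def build_constraints(table, suppressed):
    """Return (A, b) such that A x = b holds for the suppressed cell values x."""
    index = {cell: k for k, cell in enumerate(suppressed)}
    n_rows, n_cols = len(table), len(table[0])
    # Each row and each column is one additive relation of the table.
    lines = [[(i, j) for j in range(n_cols)] for i in range(n_rows)]
    lines += [[(i, j) for i in range(n_rows)] for j in range(n_cols)]
    A, b = [], []
    for line in lines:
        hidden = [cell for cell in line if cell in index]
        if not hidden:
            continue  # nothing suppressed here, so no constraint is needed
        coeffs = [0] * len(suppressed)
        for cell in hidden:
            coeffs[index[cell]] = 1
        total = sum(table[i][j] for i, j in line)
        published = sum(table[i][j] for i, j in line if (i, j) not in index)
        # Suppressed cells must add up to the total minus the published cells.
        A.append(coeffs)
        b.append(total - published)
    return A, b


A, b = build_constraints(table, suppressed)
```

Anyone holding the published table can form the same A and b, which is why the suppression pattern alone determines the solution space.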

Schematic N-D Solution Space Surrounding True Table Values
Solution space defined by lower and upper bounds on suppressed table cells
Typically multiple solutions satisfying Ax = b exist
Safe distance away from edges
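
As a rough illustration of how lower and upper bounds delimit the solution space, the sketch below handles the simplest case: four suppressed cells forming a 2x2 block, where every non-negative solution lies on one line. The row sums (70, 45), column sums (50, 65), the starting solution `x0`, and the helper names are hypothetical. A real audit would solve two linear programs per cell instead of using this closed form.

```python
def feasible_range(x0, d):
    """Range of t for which x0 + t*d stays non-negative in every component."""
    t_lo, t_hi = float("-inf"), float("inf")
    for xi, di in zip(x0, d):
        if di > 0:
            t_lo = max(t_lo, -xi / di)
        elif di < 0:
            t_hi = min(t_hi, -xi / di)
    return t_lo, t_hi


def cell_bounds(x0, d):
    """Lower and upper bound of each suppressed cell along the direction d."""
    t_lo, t_hi = feasible_range(x0, d)
    bounds = []
    for xi, di in zip(x0, d):
        ends = sorted((xi + t_lo * di, xi + t_hi * di))
        bounds.append((ends[0], ends[1]))
    return bounds


# Cells ordered as (r0c0, r0c1, r1c0, r1c1); hypothetical constraints:
# r0c0 + r0c1 = 70, r1c0 + r1c1 = 45, r0c0 + r1c0 = 50, r0c1 + r1c1 = 65.
x0 = [50, 20, 0, 45]   # one particular solution of the constraints
d = [1, -1, -1, 1]     # moving along d leaves all four sums unchanged

bounds = cell_bounds(x0, d)
midpoints = [(lo + hi) / 2 for lo, hi in bounds]
```

Here every cell has a range of width 45, and any value inside that range is consistent with the published totals. The midpoints are the values farthest from both edges.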

Two families of solution techniques are in wide use today. Both visit a progressively improving series of trial solutions, until a solution is reached that satisfies the conditions for an optimum.
Who needs it?

Simplex methods, introduced by Dantzig about 50 years ago, visit "basic" solutions computed by fixing enough of the variables at their bounds to reduce the constraints Ax = b to a square system, which can be solved for unique values of the remaining variables. Basic solutions represent extreme boundary points of the feasible region defined by Ax = b, x >= 0, and the simplex method can be viewed as moving from one such point to another along the edges of the boundary.

Simplex Solutions Tend to Cluster Around Edges of the Solution Space
Neutral or Null Objective Function

Barrier or interior-point methods, by contrast, visit points within the interior of the feasible region.
This creates a real big problem for the SDL task.

Interior Point Solutions Tend to Cluster Towards the Center of the Solution Space
Neutral or Null Objective Function

Supporting Example From Prof. Jordi Castro
min 0
s.t. x1 + x2 + x3 = 3
     x1, x2, x3 >= 0
Interior-point methods will provide the solution x1 = x2 = x3 = 1.
Simplex methods will provide some xi = 3, with the remaining two xj = 0.
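
The contrast in this example can be reproduced with a small sketch. It does not run a real simplex or interior-point solver. The basic solutions are enumerated directly, and the centered point is found by a plain gradient ascent on the log barrier sum(log xi), used here as a stand-in for the centering behaviour of barrier methods. The function names, starting point, and step size are assumptions made for illustration.

```python
def basic_solutions(total=3, n=3):
    """Basic solutions of x1 + ... + xn = total, x >= 0.

    With one equality constraint, a basic solution keeps one variable
    and fixes the other n - 1 variables at their lower bound of zero.
    """
    solutions = []
    for basic in range(n):
        x = [0] * n
        x[basic] = total
        solutions.append(x)
    return solutions


def barrier_center(x, step=0.05, iters=2000):
    """Maximize sum(log(x_i)) while keeping sum(x_i) fixed.

    Subtracting the mean gradient keeps every step on the plane
    sum(x_i) = const, so the starting total is preserved.
    """
    x = list(x)
    for _ in range(iters):
        grad = [1.0 / xi for xi in x]
        mean = sum(grad) / len(grad)
        x = [xi + step * (gi - mean) for xi, gi in zip(x, grad)]
    return x


vertices = basic_solutions()                 # the points a simplex method can return
center = barrier_center([2.5, 0.3, 0.2])     # hypothetical interior starting point
```

The vertices put the entire total on a single variable, while the barrier iteration drifts to the point where all three variables are equal, matching the two outcomes described on the slide.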

Table available from

Statistical Estimates
- Additive Point Estimates
- Averages
- Frequency Distributions
- Additive Using Averages
- Additive Using Frequency Distributions
- Using CTA Principles

[Figure labels: Low, High, Peak, True]
Interval = (High - Low) / 10
Distance = ABS(Peak interval - True interval)
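
The two formulas above can be sketched as follows, assuming the ten intervals are numbered 0 to 9 from Low to High and that the peak is the interval holding the most estimates (ties go to the lowest interval). The estimate values, the true value, and the function names are hypothetical.

```python
def interval_index(value, low, high, n_intervals=10):
    """Index (0-based) of the interval of width (high - low) / n_intervals holding value."""
    width = (high - low) / n_intervals
    k = int((value - low) / width)
    # Clamp so that value == high falls in the last interval.
    return min(max(k, 0), n_intervals - 1)


def peak_distance(estimates, true_value, low, high, n_intervals=10):
    """Distance = ABS(peak interval - true interval), plus the frequency counts."""
    counts = [0] * n_intervals
    for estimate in estimates:
        counts[interval_index(estimate, low, high, n_intervals)] += 1
    peak = counts.index(max(counts))  # first interval with the highest frequency
    true_interval = interval_index(true_value, low, high, n_intervals)
    return abs(peak - true_interval), counts


# Hypothetical estimates of one suppressed cell whose audit range is [0, 100].
estimates = [12, 15, 18, 33, 35, 36, 38, 52, 71, 90]
distance, counts = peak_distance(estimates, true_value=58, low=0, high=100)
```

A distance of 0 means the most frequent estimate falls in the same interval as the true value. Larger distances mean the peak of the distribution points away from it.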

[Figure: True value and frequency distribution of first suppressed cell. Cell: 1. True Value: 714. Range: 409. Dif 0 Within: LP audit based range = 409. Legend: Peak Density, Range.]

Conclusions and Suggestions
- Avoid tighter bounds
- Over protection (not the same as over suppression) is not an undesirable property
- Use larger cells as complements
- Use of cost function 1/(cell value) or log(cell value)/(cell value) is preferred
- Synthetic tabular data, a.k.a. controlled tabular adjustment, might be worth looking into
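
A small sketch of the two suggested cost functions shows why they favour larger cells as complements: a suppression routine that minimizes total cost will pick the cheapest candidates first. The cell values and function names are hypothetical, and the natural logarithm is an assumption. Note that log(v)/v only decreases with v once v exceeds e (about 2.72).

```python
import math


def cost_reciprocal(cell_value):
    """Cost 1 / (cell value): larger cells are cheaper to suppress."""
    return 1.0 / cell_value


def cost_log_ratio(cell_value):
    """Cost log(cell value) / (cell value), using the natural log (an assumption)."""
    return math.log(cell_value) / cell_value


# Hypothetical candidate complementary cells, all larger than e.
cells = [5, 50, 500, 5000]
reciprocal_costs = [cost_reciprocal(v) for v in cells]
log_ratio_costs = [cost_log_ratio(v) for v in cells]

# Under either cost function the largest cell is the cheapest complement.
cheapest_by_reciprocal = min(cells, key=cost_reciprocal)
cheapest_by_log_ratio = min(cells, key=cost_log_ratio)
```

The reciprocal cost falls off faster than the log ratio, so it expresses a stronger preference for large cells.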

THANK YOU!
ADDITIONAL INFORMATION FROM