U N E C E
Software Tools for Statistical Disclosure Control by Complementary Cell Suppression – Reality Check
Ramesh A. Dandekar
U. S. Department of Energy
Washington DC
Equality Constraints Associated With Suppression Pattern
A x = b Used To Create Solution Space Around Unknown or Suppressed Table Values
Schematic N-D Solution Space Surrounding True Table Values
Solution Space Defined by Lower and Upper Bounds on Suppressed Table Cells
Typically Multiple Solutions Satisfying A x = b Exist
Safe distance away from edges
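As a small illustration of the A x = b solution space, here is a minimal sketch for a hypothetical 2x2 block of suppressed cells. The numbers in `b` and the helper names (`residual`, `solution`, `feasible_t_range`, `cell_bounds`) are made up for this example and are not taken from the talk. With four unknowns and only three independent equations, one free parameter `t` remains, so many solutions satisfy A x = b; non-negativity then gives lower and upper bounds on each suppressed cell.

```python
# Hypothetical 2x2 block of suppressed cells, unknowns ordered x11, x12, x21, x22.
# Each row of A is one additive table relation with published cells moved to b.
A = [
    [1, 1, 0, 0],  # row 1:    x11 + x12 = 30
    [0, 0, 1, 1],  # row 2:    x21 + x22 = 20
    [1, 0, 1, 0],  # column 1: x11 + x21 = 25
    [0, 1, 0, 1],  # column 2: x12 + x22 = 25
]
b = [30, 20, 25, 25]  # made-up residual totals


def residual(x):
    """Return A x - b; all zeros means x is consistent with the published table."""
    return [sum(a * v for a, v in zip(row, x)) - rhs for row, rhs in zip(A, b)]


def solution(t):
    """One-parameter family of solutions of A x = b, indexed by t = x11."""
    r1, r2, c1, _c2 = b
    return [t, r1 - t, c1 - t, r2 - c1 + t]


def feasible_t_range():
    """Values of t that keep every suppressed cell non-negative."""
    r1, r2, c1, _c2 = b
    return max(0, c1 - r2), min(r1, c1)


def cell_bounds():
    """Lower and upper bound of each cell; each cell is linear in t,
    so its extremes occur at the two ends of the feasible t range."""
    lo_t, hi_t = feasible_t_range()
    ends = [solution(lo_t), solution(hi_t)]
    return [(min(pair), max(pair)) for pair in zip(*ends)]


print(feasible_t_range())
print(cell_bounds())
```

Every `t` in the feasible range gives a different table that reproduces the same published totals, which is the solution space the schematic refers to.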
Two families of solution techniques are in wide use today. Both visit a progressively improving series of trial solutions, until a solution is reached that satisfies the conditions for an optimum.
Who needs it?
Simplex methods, introduced by Dantzig about 50 years ago, visit "basic" solutions computed by fixing enough of the variables at their bounds to reduce the constraints Ax = b to a square system, which can be solved for unique values of the remaining variables. Basic solutions represent extreme boundary points of the feasible region defined by Ax = b, x >= 0, and the simplex method can be viewed as moving from one such point to another along the edges of the boundary.
Simplex Solutions Tend to Cluster around Edges of the Solution Space
Neutral or Null Objective Function
Barrier or interior-point methods, by contrast, visit points within the interior of the feasible region.
Creates Real Big Problem for SDL Task
Interior Point Solutions Tend to Cluster Towards the Center of the Solution Space
Neutral or Null Objective Function
Supporting Example From Prof. Jordi Castro
min 0
s.t. x1 + x2 + x3 = 3
     x1, x2, x3 >= 0
Interior point methods will provide the solution x1 = x2 = x3 = 1.
Simplex methods will provide some xi = 3, with the remaining two xj = 0.
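The contrast in this example can be reproduced without an LP solver. The sketch below is illustrative only and does not implement either solver family: `basic_solutions` enumerates the extreme points a simplex method could return, and `analytic_center` uses plain gradient ascent on the log-barrier sum(log xi) as a stand-in for the centering behavior of interior-point methods under a null objective. The function names, starting point, and step size are assumptions chosen for this example.

```python
def basic_solutions(n=3, total=3.0):
    """Extreme points of {x >= 0, sum(x) = total}: fix n - 1 variables at
    their bound 0 and solve the one remaining equation for the last variable."""
    vertices = []
    for free in range(n):
        x = [0.0] * n
        x[free] = total
        vertices.append(x)
    return vertices


def analytic_center(start, steps=500, lr=0.1):
    """Maximize sum(log x_i) with sum(x) held fixed, using gradient ascent
    along directions whose components sum to zero."""
    x = list(start)
    for _ in range(steps):
        grad = [1.0 / v for v in x]              # gradient of sum(log x_i)
        mean = sum(grad) / len(grad)
        # Subtracting the mean keeps x1 + x2 + x3 unchanged by the step.
        x = [v + lr * (g - mean) for v, g in zip(x, grad)]
    return x


vertices = basic_solutions()
center = analytic_center([0.5, 1.0, 1.5])
print("extreme points:", vertices)
print("barrier center:", [round(v, 6) for v in center])
```

The extreme points each put the full total of 3 on a single variable, while the barrier iteration settles close to x1 = x2 = x3 = 1 from an uneven feasible start.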
Table available from
Statistical Estimates
Additive Point Estimates
Averages
Frequency Distributions
Additive Using Averages
Additive Using Frequency Distributions
Using CTA Principles
[Figure: frequency distribution of a suppressed cell between Low and High, with the True and Peak positions marked]
Interval = (High - Low)/10
Distance = ABS(Peak interval - True interval)
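The interval and distance definitions above can be sketched in a few lines. This is one possible reading, not the author's code: the range from Low to High is split into 10 equal intervals, the peak interval is taken to be the one holding the most estimates, and the distance is counted in whole intervals. The function names and the sample values are hypothetical.

```python
def interval_index(value, low, high, n_intervals=10):
    """Index (0 .. n_intervals - 1) of the interval containing value,
    with interval width (high - low) / n_intervals."""
    width = (high - low) / n_intervals
    if width == 0:
        return 0
    idx = int((value - low) // width)
    # Clamp so that value == high falls in the last interval.
    return min(max(idx, 0), n_intervals - 1)


def peak_distance(estimates, true_value, low, high, n_intervals=10):
    """ABS(peak interval - true interval), where the peak interval is the
    most heavily populated one (the first such interval on a tie)."""
    counts = [0] * n_intervals
    for value in estimates:
        counts[interval_index(value, low, high, n_intervals)] += 1
    peak = counts.index(max(counts))
    return abs(peak - interval_index(true_value, low, high, n_intervals))


# Made-up estimates of one suppressed cell with audit range [0, 100].
near = peak_distance([51, 52, 58, 12, 95], true_value=55, low=0, high=100)
far = peak_distance([11, 12, 13, 55], true_value=55, low=0, high=100)
print(near, far)
```

A distance of 0 means the estimates pile up in the same interval as the true value; larger distances mean the peak sits further from it.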
TRUE VALUE AND FREQUENCY DISTRIBUTION OF FIRST SUPPRESSED CELL
Cell: 1  True Value: 714  Range: 409  Dif: 0  Within: LP audit based range = 409
[Figure labels: Peak, Density, Range]
Conclusions and Suggestions
Avoid Tighter Bounds
Over Protection (not the same as over suppression) is not an undesirable property
Use Larger Cells as Complements
Use of Cost Function 1/(cell value) or Log(cell value)/value is preferred
Synthetic Tabular Data, a.k.a. Controlled Tabular Adjustments, might be worth looking into
THANK YOU!
ADDITIONAL INFORMATION FROM