Re-development of the Cell Suppression Methodology at the US Census Bureau Philip Steel, James Fagan, Paul Massell, Richard Moore Jr., John Slanta, Bei.

1 Re-development of the Cell Suppression Methodology at the US Census Bureau Philip Steel, James Fagan, Paul Massell, Richard Moore Jr., John Slanta, Bei Wang

2 Background Jewett’s network flow program Need for new program 2012 economic census LP (linear programming) methodology R&M cell suppression team

3 Processing Model Preprocessing – Create table description – Determine primaries – Unduplicate Sequential processing of primaries Queue reduction Test company protection (aggregate/supercell) Sequential processing of supercells
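The sequential processing and queue reduction steps listed above can be sketched roughly as follows. This is only an illustration: `process_primaries`, `solve`, and `is_protected_by` are hypothetical names, the "solver" is a stub that returns the set of cells carrying flow, and the actual system solves an LP at this step.

```python
from collections import deque

def process_primaries(primaries, solve, is_protected_by):
    """Process targets in order; after each solve, drop queued primaries
    that the solution already protects (queue reduction)."""
    queue = deque(primaries)
    solved, skipped = [], []
    while queue:
        target = queue.popleft()
        solution = solve(target)          # stand-in for one LP solve
        solved.append(target)
        remaining = deque()
        for p in queue:
            if is_protected_by(p, solution):
                skipped.append(p)         # no separate solve needed for p
            else:
                remaining.append(p)
        queue = remaining
    return solved, skipped

# Toy data (hypothetical): cells that carry flow in each target's solution.
coverage = {"P1": {"P1", "P2", "P4"}, "P3": {"P3", "P5"}}
solve = lambda target: coverage.get(target, {target})
is_protected_by = lambda p, solution: p in solution

solved, skipped = process_primaries(["P1", "P2", "P3", "P4", "P5"],
                                    solve, is_protected_by)
print(solved, skipped)
```

In this toy run only two of the five primaries need their own solve; the protection test used here (membership in the flow set) is a simplification of the real condition.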

4 Table relations Marginals are the sum of interior cells. Geographic relationships tend to generate our most complex sets of table relations: – State is the sum of the metropolitan areas within the state and the balance. – State is also the sum of counties. Relations are of the form A = B + … + Z, where A, B, …, Z are (one of) rows, columns, or levels that define some Cartesian integer space (i,j,k). Duplicates are recorded as A = B (e.g., a county is also a place).
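A relation of the form A = B + … + Z can be checked mechanically. The sketch below assumes a one-dimensional slice of a table stored as a dictionary from label to value; `check_relation`, the labels, and the figures are hypothetical and chosen only to mirror the state/metro/county example.

```python
def check_relation(values, total, parts, tol=1e-9):
    """Return True if values[total] equals the sum of values[p] for p in parts."""
    return abs(values[total] - sum(values[p] for p in parts)) <= tol

# Hypothetical slice: one state broken down two different ways.
values = {
    "state": 100.0,
    "metro_1": 40.0, "metro_2": 35.0, "balance": 25.0,
    "county_a": 60.0, "county_b": 40.0,
}
relations = [
    ("state", ["metro_1", "metro_2", "balance"]),  # metro areas plus balance
    ("state", ["county_a", "county_b"]),           # counties
]
results = [check_relation(values, total, parts) for total, parts in relations]
print(results)
```

Because the same state total appears in both relations, suppressing a cell in one breakdown can expose it through the other, which is why overlapping geographic relations are hard to protect.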

5

6 Objective Function

7 Additivity constraint generator (based on row relations) (b) for $ii = 1,\dots,rr$, $j = 1,\dots,\mathrm{cols}$, $k = 1,\dots,\mathrm{levs}$ : $\mathrm{limr}(ii) \ge 1$, $\mathrm{ws}(ii,j,k) = 0$

8 Bounds $h_{i,j,k} = \max(0, v_{i,j,k})$ for $i = 1,\dots,\mathrm{rows}$, $j = 1,\dots,\mathrm{col}$, $k = 1,\dots,\mathrm{levs}$ : $(i,j,k) \in A$
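As a small illustration of the bound formula, the sketch below computes h for each cell in A as max(0, v). The dictionary representation, the name `cell_bounds`, and the sample values are assumptions; the surviving slide text does not define the set A further.

```python
def cell_bounds(v, A):
    """h[i,j,k] = max(0, v[i,j,k]) for every cell index (i,j,k) in A."""
    return {cell: max(0, v[cell]) for cell in A}

# Hypothetical cell values keyed by (i, j, k); one value is negative.
v = {(1, 1, 1): 120, (1, 2, 1): 0, (2, 1, 1): -15}
A = [(1, 1, 1), (1, 2, 1), (2, 1, 1)]

h = cell_bounds(v, A)
print(h)
```

Clamping at zero means a cell with a negative value gets a bound of 0 rather than a negative bound.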

9 For the primary

10

11 Skip P The model changes only in the target primary's constraints. How can the minimal solution for one target be transformed into a solution for another target? By applying a scalar that converts the flow through the second P to the fixed value of the model! This can be done when the scalar does not violate the bounding conditions and the complementary flow in the target is 0, i.e., when the solution's flow through the secondary target exceeds its protection requirement.
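The rescaling test described above can be sketched as follows. This is an assumed representation, not the production code: flows are stored per cell in two dictionaries (`flow` and its complementary direction `comp`), `bounds` caps either direction, and `rescale_for_target` is a hypothetical name.

```python
def rescale_for_target(flow, comp, bounds, target, fixed_value, tol=1e-9):
    """Return the scalar that turns an existing solution into one for
    `target`, or None if the solution cannot be reused."""
    through = flow.get(target, 0.0)
    # Need positive flow through the target and no complementary flow in it.
    if through <= tol or comp.get(target, 0.0) > tol:
        return None
    s = fixed_value / through
    # The scaled flows must still respect every cell bound.
    for direction in (flow, comp):
        for cell, f in direction.items():
            if s * f > bounds[cell] + tol:
                return None
    return s

# Hypothetical solution found while protecting P1.
flow = {"P1": 10.0, "P2": 25.0, "P3": 4.0, "C1": 25.0}
comp = {"C2": 5.0}
bounds = {"P1": 50.0, "P2": 30.0, "P3": 20.0, "C1": 40.0, "C2": 40.0}

s_p2 = rescale_for_target(flow, comp, bounds, "P2", fixed_value=10.0)
s_p3 = rescale_for_target(flow, comp, bounds, "P3", fixed_value=10.0)
print(s_p2, s_p3)
```

When the flow through the second primary already exceeds the fixed value, the scalar is at most 1, so scaling down cannot push any cell over its bound. For P3 the scalar would be 2.5, which breaks the bound on P2, so P3 still needs its own solve.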

12 Empirical confirmation In our large sparse tables, we would see a lot of objective 0 results. That is, the solver finds a 0 cost pattern to protect the primary … it is already protected! Skip P eliminated most objective 0 results and left intact the sequence of positive objectives and their solutions.

13 Fat solution CPLEX is using a dual simplex method to find solutions. The solutions have a growing 0 cost component, with many more cells than are required to protect the target P. The flow in the 0 cost cells far exceeds what is required to protect the target P (except in very small or dense examples). The solution “lights up” the possible flows in the table’s current state, giving a “fat” solution.

14 Skip P and the fat solution
Optimization number   Count of P with flow   Running total of skipped P
1                     3961                   3076
2                     3952                   3243
...                   ...                    ...
587                   11035                  10448
588                   11037                  10449

15 dg10 sector 44 Cartesian cells: 367,605 (2d) Non-zero cells: 159,849 Relations: 283 (row and column) – 14,000 potential tables, linked P: 95,062 LP problems: 10,604 Typical LP size – Reduced LP has 64826 rows, 156809 columns, and 528838 nonzeros Time: 8hr:37min (includes everything)

16 Comparison between network and LP on one (of hundreds) dataset from 2007
                            Network flow      LP
C                           14,551            11,283
Cvalue                      1,813,213,710     598,886,234
PubValue                    12,348,960,578    13,563,288,054
(@10%) undersuppressions    #                 0
time                        24min             8hrs 37min
Statistics based on unduplicated data with an approximation of a published status flag
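The Cvalue and PubValue figures for the two methods add up to the same total, so they can be compared as shares of one table. The short calculation below assumes that Cvalue is the value lost to complementary suppression and PubValue is the value published; the slide does not define these labels, and the variable names are illustrative.

```python
# Figures copied from the comparison slide.
network = {"Cvalue": 1_813_213_710, "PubValue": 12_348_960_578}
lp = {"Cvalue": 598_886_234, "PubValue": 13_563_288_054}

def total(stats):
    """Cvalue plus PubValue for one method."""
    return stats["Cvalue"] + stats["PubValue"]

def cvalue_share(stats):
    """Fraction of the combined value that falls under Cvalue."""
    return stats["Cvalue"] / total(stats)

print(total(network), total(lp))
print(f"network flow: {cvalue_share(network):.1%}")
print(f"LP:           {cvalue_share(lp):.1%}")
```

Under that reading, the LP method gives up a markedly smaller share of value than the network flow program on this dataset, at the cost of a much longer run time.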

17 Thank you! philip.m.steel@census.gov

