G-Confid: Turning the tables on disclosure risk Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality Ottawa, Canada 30 October 2013 Peter.


1 G-Confid: Turning the tables on disclosure risk
Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality
Ottawa, Canada, 30 October 2013
Peter Wright

2 G-Confid: a cell suppression application
• Use with any table size and any number of dimensions (subject to hardware / memory limitations)
• Available for SAS 9.2 and 9.3; SAS EG 4.3 and 5.1
Overview by component
• PROC SENSITIVITY identifies sensitive cells
  - Highlights, inputs, strategies
• Macro SUPPRESS creates a suppression pattern
  - Inputs, outputs, strategies
• Macro AUDIT audits a suppression pattern

3 PROC SENSITIVITY identifies confidential cells
Highlights:
• Choice of sensitivity rule: p-percent, (n,k), arbitrary
• Allows multiple decomposition
[Formula shown on the slide is not reproduced in the transcript]
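As a rough illustration of the two named rules, the sketch below uses the textbook linear forms of the p-percent and (n,k) rules, where a cell is sensitive when S > 0. This is an assumption: the slide's own formula did not survive in the transcript, and G-Confid's exact definitions and scaling may differ. The function names are hypothetical and are not part of G-Confid.

```python
def p_percent_sensitivity(contributions, p):
    """Textbook p-percent rule: S = (p/100)*x1 - (T - x1 - x2).

    x1 and x2 are the largest and second-largest contributions and T is
    the cell total. The cell is sensitive when S > 0.
    (Assumed form; not taken from the G-Confid documentation.)
    """
    xs = sorted(contributions, reverse=True)
    x1 = xs[0] if xs else 0.0
    x2 = xs[1] if len(xs) > 1 else 0.0
    remainder = sum(xs) - x1 - x2  # everything except the two largest
    return (p / 100.0) * x1 - remainder


def nk_sensitivity(contributions, n, k):
    """Textbook (n,k) rule: S = (sum of the n largest) - (k/100)*T."""
    xs = sorted(contributions, reverse=True)
    return sum(xs[:n]) - (k / 100.0) * sum(xs)


if __name__ == "__main__":
    dominated = [100, 50, 5, 3]    # two large enterprises, small remainder
    balanced = [100, 50, 30, 20]   # remainder is large enough to protect x1
    print(p_percent_sensitivity(dominated, p=20) > 0)  # sensitive
    print(p_percent_sensitivity(balanced, p=20) > 0)   # not sensitive
    print(nk_sensitivity(dominated, n=2, k=90) > 0)    # sensitive
```

In the first cell the second-largest enterprise can subtract its own value from the total and estimate the largest one to within 20%, which is what the positive S signals.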

4 Inputs for PROC SENSITIVITY
• Definition of hierarchy(ies) for each table dimension
• Microdata file
  - Classification variables (e.g., geography, industry)
  - Enterprise identifier
  - Enterprise value
Tip: to reduce the sensitivity of a cell by the value of an enterprise, set the enterprise identifier = missing

5 Example of SAS code to run PROC SENSITIVITY

    proc sensitivity data=microfile             /* microdata file */
                     outconstraint=consfile
                     outcell=cellfile
                     outlargest=largestfile
                     hierarchy="0 East West; 0 1 2 3;"
                     srule="pq.20"              /* sensitivity rule */
                     range="East A B: West C D; 1 101 201 301: 2 102 202 302: 3 103 203 303;"
                     minresp=5;                 /* minimum number of respondents */
      id Enterpriseid;                          /* enterprise identifier */
      var Income;                               /* enterprise value */
      dimension EastWest Industry;              /* classification variables */
    run;

6 Strategies using PROC SENSITIVITY
• Use the MINRESP=r option to set the minimum number of respondents
  - Any cell with fewer than r respondents is assigned a sensitivity of max{1, S}, where S is the sensitivity of the cell
  - Only positive (>0) values are counted as respondents
  - The MINRESP rule is ignored for a cell with a value contributed by an anonymous enterprise
Note: we can use MINRESP without applying a sensitivity rule
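The MINRESP logic described on this slide can be sketched as follows. This is a minimal illustration, not G-Confid code: the function name is hypothetical, an anonymous enterprise is represented by an identifier of None, and each record is assumed to be a separate respondent.

```python
def minresp_sensitivity(contributions, sensitivity, minresp):
    """Apply the MINRESP rule to one cell.

    contributions: list of (enterprise_id, value); id None = anonymous.
    sensitivity:   S, the sensitivity already computed for the cell.
    minresp:       r, the minimum number of respondents.
    """
    # The rule is ignored when an anonymous enterprise contributes.
    if any(ent_id is None for ent_id, _ in contributions):
        return sensitivity
    # Only positive values count as respondents.
    respondents = sum(1 for _, value in contributions if value > 0)
    if respondents < minresp:
        return max(1.0, sensitivity)  # forces the cell to be sensitive
    return sensitivity


if __name__ == "__main__":
    cell = [("A", 10), ("B", 0), ("C", 5)]  # two positive respondents
    print(minresp_sensitivity(cell, sensitivity=-3.0, minresp=5))
    cell_with_anonymous = [("A", 10), (None, 4)]
    print(minresp_sensitivity(cell_with_anonymous, sensitivity=-3.0, minresp=5))
```

Because the result is max{1, S}, a cell that was already sensitive keeps its original S when S is larger than 1.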

7 Strategies using PROC SENSITIVITY (continued)
• To reduce oversuppression, apply rules that make use of sampling weights
  - Example: if the sampling weight w_i > 3, make the enterprise anonymous (set ID value = missing). G-Confid will use its contribution to reduce the sensitivity of the cell.
Find more strategies in: Tambay and Fillion (Proceedings of the JSM 2013)
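The weighting example amounts to a pre-processing step on the microdata before PROC SENSITIVITY is run. A minimal sketch, assuming records held as dictionaries with hypothetical keys "id", "value" and "weight", and None standing in for a missing identifier:

```python
def anonymize_heavily_weighted(records, threshold=3.0):
    """Return a copy of the records in which every enterprise whose
    sampling weight exceeds the threshold has its identifier set to
    None (missing), so that it is treated as anonymous."""
    return [
        {**rec, "id": None} if rec["weight"] > threshold else dict(rec)
        for rec in records
    ]


if __name__ == "__main__":
    microdata = [
        {"id": "E1", "value": 120.0, "weight": 1.0},
        {"id": "E2", "value": 15.0, "weight": 4.5},  # weight > 3
    ]
    for rec in anonymize_heavily_weighted(microdata):
        print(rec)
```

The threshold of 3 comes from the slide's example; the choice of threshold in practice is a policy decision.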

8 Macro SUPPRESS: complementary suppression
• Uses the SAS/OR® LP solver
• Input files: (i) cell sensitivities file, and (ii) linear constraints file
• Syntax:
    %Suppress(InCell=, Constraint=, CFunction1=, CFunction2=, CVar1=, CVar2=, OutCell=, ByVars=, OutComplement=, ScaleCost=);
• Output file has final status (Suppress, Publish) and the net variation (largest amount the cell was "moved")

9 Strategies using the macro SUPPRESS
• Choice of cost functions (functions of cell total)

    Cost function    Definition
    SIZE             tot
    DIGITS           log[tot+1]
    CONSTANT         1
    INFORMATION      log[tot+1]/[tot+1]

  - Can run the LP process twice to reduce the number of suppressions (e.g., SIZE or DIGITS, then INFORMATION)
• Can favour publishing certain cells by defining higher cost values (by default, cost=tot)
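To see how the four cost functions rank cells of different sizes, they can be evaluated side by side. The sketch below only evaluates the formulas listed on the slide; it does not call the macro. The base of the logarithm is not stated on the slide, so the natural logarithm is an assumption here.

```python
import math

# Cost of suppressing a cell as a function of its total (tot >= 0).
# Natural log is an assumption; the slide does not give the base.
COST_FUNCTIONS = {
    "SIZE": lambda tot: float(tot),
    "DIGITS": lambda tot: math.log(tot + 1),
    "CONSTANT": lambda tot: 1.0,
    "INFORMATION": lambda tot: math.log(tot + 1) / (tot + 1),
}

if __name__ == "__main__":
    for tot in (10, 1_000, 100_000):
        costs = ", ".join(
            f"{name}={func(tot):.4f}" for name, func in COST_FUNCTIONS.items()
        )
        print(f"tot={tot}: {costs}")
```

SIZE penalises suppressing large cells most heavily, DIGITS grows much more slowly, and CONSTANT makes every suppression equally costly, so it tends to minimise the count of suppressed cells.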

10 Macro AUDIT: validates a suppression pattern
• Calculates minimum and maximum values for each suppressed cell using the LP solver
• Provides results for each cell (protection achieved, not achieved, or exact disclosure)
• Coming soon: pre-set narrower starting intervals than the default values (0.5tot and 1.5tot) using the Shuttle algorithm (Buzzigoli and Giusti (2006))
  - Using the Shuttle algorithm to pre-set the starting intervals reduces run time
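The idea behind the audit can be shown on the smallest possible case: one table row with exactly two suppressed cells. With a single additive constraint the minimum and maximum can be written in closed form, so no LP solver is needed. This is a simplified sketch under that assumption, not the macro's algorithm; the function name is hypothetical, and the starting intervals are the default 0.5tot and 1.5tot mentioned on the slide.

```python
def audit_two_suppressed(row_total, published, tot_a, tot_b, lo=0.5, hi=1.5):
    """Feasible interval for suppressed cell A, as seen by an attacker who
    knows the row total, the published cells, and that each suppressed
    cell lies in [lo*tot, hi*tot].

    Valid only for one constraint: A + B = row_total - sum(published).
    """
    s = row_total - sum(published)      # known sum of the two suppressed cells
    lo_a, hi_a = lo * tot_a, hi * tot_a
    lo_b, hi_b = lo * tot_b, hi * tot_b
    min_a = max(lo_a, s - hi_b)         # B at its upper bound pushes A down
    max_a = min(hi_a, s - lo_b)         # B at its lower bound pushes A up
    return min_a, max_a


if __name__ == "__main__":
    # Row total 100, one published cell of 40, suppressed cells 50 and 10.
    low, high = audit_two_suppressed(100, [40], tot_a=50, tot_b=10)
    print(f"A lies in [{low}, {high}]")
    print("exact disclosure" if low == high else "interval disclosed only")
```

Here the small complementary cell narrows the interval for the large cell from [25, 75] to [45, 55]; whether that width counts as "protection achieved" depends on the protection required by the sensitivity rule, which this sketch does not evaluate.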

11 Conclusion
• PROC SENSITIVITY
  - Use pre-defined or customized sensitivity rule
  - Can do multiple decomposition
  - MINRESP function
  - Can apply weighting strategies
• Macro SUPPRESS
  - Can favour cells to publish (or suppress)
• Macro AUDIT
Coming soon: additive controlled rounding

12 For more information, please contact:
Peter Wright
Peter.Wright@statcan.gc.ca

