1 Security Methods for Statistical Databases

2 Introduction
- Statistical databases containing medical information are often used for research
- Some of the data is protected by laws that safeguard the privacy of the patient
- Proper security precautions must be implemented to comply with these laws and respect the sensitivity of the data

3 Accuracy vs. Confidentiality
- Accuracy: researchers want to extract accurate and meaningful data
- Confidentiality: patients, laws, and database administrators want to maintain the privacy of patients and the confidentiality of their information

4 Laws
- Health Insurance Portability and Accountability Act (HIPAA), Privacy Rule
- Covered organizations must comply by April 14, 2003
- Designed to improve the efficiency of the healthcare system by using electronic exchange of data while maintaining security
- Covered entities (health plans, healthcare clearinghouses, healthcare providers) may not use or disclose protected information except as permitted or required
- The Privacy Rule establishes a "minimum necessary standard" so that covered entities evaluate their current regulations and security precautions

5 HIPAA Compliance
- Companies offer third-party certification of covered entities
- Such companies check your company and its associated companies for compliance with HIPAA
- This can help with rapid implementation of, and compliance with, HIPAA regulations

6 Types of Statistical Databases
- Static: a static database is made once and never changes
  - Example: U.S. Census
- Dynamic: a dynamic database changes continuously to reflect real-time data
  - Example: most online research databases

7 Security Methods
- Access Restriction
- Query Set Restriction
- Microaggregation
- Data Perturbation
- Output Perturbation
- Auditing
- Random Sampling

8 Access Restriction
- Databases normally have different access levels for different types of users
- User IDs and passwords are the most common method of restricting access
- In a medical database:
  - Doctors and healthcare representatives: full access to information
  - Researchers: access to partial information only (e.g. aggregate information)
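A minimal sketch of the two access levels described on this slide, assuming a toy in-memory table. The role names, field names, and the `query_costs` function are hypothetical and only illustrate the idea that researchers see aggregates while doctors see full records.

```python
# Hypothetical toy records; field names are illustrative assumptions.
RECORDS = [
    {"patient": "P1", "cost": 1200.0},
    {"patient": "P2", "cost": 900.0},
    {"patient": "P3", "cost": 1500.0},
]

def query_costs(role, records):
    """Return full records for doctors and only aggregates for researchers."""
    if role == "doctor":
        # Full access: individual records are returned as stored.
        return list(records)
    if role == "researcher":
        # Partial access: only a count and a mean, never individual rows.
        costs = [r["cost"] for r in records]
        mean_cost = sum(costs) / len(costs) if costs else None
        return {"count": len(costs), "mean_cost": mean_cost}
    # Any other role is refused outright.
    raise PermissionError(f"role {role!r} has no access")

doctor_view = query_costs("doctor", RECORDS)
researcher_view = query_costs("researcher", RECORDS)
```

In a real system the role would come from an authenticated session (user ID and password, as the slide notes) rather than from a function argument.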

9 Query Set Restriction
- A query-set size control sets a minimum number of records that must be in the result set
- Query results are displayed only if the size of the query set satisfies this condition
- Setting a minimum query-set size helps protect against the disclosure of individual data

10 Query Set Restriction
- Let K represent the minimum number of records that must be present in the query set
- Let R represent the size of the query set
- The query results can be displayed only if K ≤ R
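The K ≤ R rule can be sketched as follows. This is an illustrative example under assumptions, not the presenters' implementation: the toy table, its field names, and the `restricted_mean_cost` function are hypothetical.

```python
from typing import Optional

# Hypothetical toy table; field names are illustrative assumptions.
PATIENTS = [
    {"age": 34, "diagnosis": "asthma", "cost": 1200.0},
    {"age": 51, "diagnosis": "asthma", "cost": 900.0},
    {"age": 47, "diagnosis": "asthma", "cost": 1500.0},
    {"age": 62, "diagnosis": "diabetes", "cost": 3100.0},
]

def restricted_mean_cost(records, predicate, k: int) -> Optional[float]:
    """Return the mean cost of the matching records, or None if fewer than k match."""
    query_set = [r for r in records if predicate(r)]
    r_size = len(query_set)  # R: size of the query set
    if r_size < k:           # answer only when K <= R
        return None
    return sum(r["cost"] for r in query_set) / r_size

# Three asthma records satisfy K = 3, so the statistic is released.
asthma_mean = restricted_mean_cost(PATIENTS, lambda r: r["diagnosis"] == "asthma", k=3)
# Only one diabetes record matches, so the query is refused.
diabetes_mean = restricted_mean_cost(PATIENTS, lambda r: r["diagnosis"] == "diabetes", k=3)
```

Here the refused query would otherwise have returned the exact cost of a single patient, which is the disclosure the minimum size is meant to prevent.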

11 Query Set Restriction

12 Microaggregation
- Raw (individual) data is grouped into small aggregates before publication
- The average value of the group replaces each individual value
- The most similar records are grouped together to maintain data accuracy
- Helps to prevent disclosure of individual data
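A minimal sketch of microaggregation on a single numeric attribute, assuming that "most similar" means adjacent in sorted order and that each group should hold at least k values. The function name and the sample data are hypothetical.

```python
def microaggregate(values, k):
    """Replace each value with the mean of its group of at least k similar values."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Indices of the values in ascending order, so neighbours are similar.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start + k
        # Fold a short tail into this group so no group has fewer than k members.
        if len(order) - end < k:
            end = len(order)
        group = order[start:end]
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            result[i] = mean
        start = end
    return result

incomes = [52, 20, 49, 22, 51, 21, 90]
masked = microaggregate(incomes, k=3)
# masked == [60.5, 21.0, 60.5, 21.0, 60.5, 21.0, 60.5]
```

The overall total (and therefore the overall mean) is unchanged by the masking, while no published value belongs to fewer than three individuals.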

13 Microaggregation
- The National Agricultural Statistics Service (NASS) publishes data about farms
- To protect against data disclosure, data is released only at the county level
- Farms in each county are averaged together to preserve as much of the data's accuracy as possible while still protecting against disclosure

14 Microaggregation

16 Data Perturbation
- Perturbed data is raw data with noise added
- Pro: with a perturbed database, if unauthorized data is accessed, the true value is not disclosed
- Con: data perturbation runs the risk of presenting biased data
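A minimal sketch of data perturbation, assuming zero-mean Gaussian noise added once to every stored value. The noise model, the `perturb_database` name, and the sample values are assumptions chosen for illustration; the slides do not specify them.

```python
import random

def perturb_database(values, noise_sd, seed=0):
    """Return a noisy copy of the data; only this copy would be stored and queried."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [v + rng.gauss(0.0, noise_sd) for v in values]

true_costs = [1200.0, 900.0, 1500.0, 3100.0]
stored_costs = perturb_database(true_costs, noise_sd=50.0)

# Every statistic is now computed from stored_costs, so even a leaked
# record reveals a noisy value rather than the true one.
perturbed_mean_cost = sum(stored_costs) / len(stored_costs)
```

Because all later statistics are computed from the noisy copy, any distortion the noise introduces carries through to every query result, which is the bias risk the slide mentions.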

17 Data Perturbation

18 Output Perturbation
- Instead of transforming the raw data as in data perturbation, only the output (the query results) is perturbed
- The bias problem is less severe than with data perturbation
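A minimal sketch of output perturbation, again assuming zero-mean Gaussian noise. The stored data stays exact and noise is added only to the answer; `perturbed_mean` and the sample values are hypothetical.

```python
import random

def perturbed_mean(values, noise_sd, rng):
    """Compute the exact mean from the raw data, then add noise to the answer only."""
    true_mean = sum(values) / len(values)
    return true_mean + rng.gauss(0.0, noise_sd)

costs = [1200.0, 900.0, 1500.0, 3100.0]  # raw data, never modified
rng = random.Random(42)                  # seeded only to make the sketch reproducible
answer = perturbed_mean(costs, noise_sd=25.0, rng=rng)
```

One caveat with this simple form: a user who can repeat the same query many times could average the answers to reduce the noise, so a practical system would likely need to return a consistent answer for repeated queries.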

19 Output Perturbation Query Results

20 Auditing
- Auditing is the process of keeping track of all queries made by each user
- Usually done with up-to-date logs
- Each time a user issues a query, the log is checked to see whether the user is querying the database maliciously
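A minimal sketch of a query audit log. The slides do not say what counts as malicious, so this example assumes one simple rule: refuse a query whose query set differs from one of the same user's earlier query sets by fewer than k records, since subtracting the two answers could isolate those few individuals. The `QueryAuditor` class and its rule are hypothetical.

```python
from collections import defaultdict

class QueryAuditor:
    """Keeps a per-user log of query sets and refuses suspicious follow-ups."""

    def __init__(self, k):
        self.k = k
        self.log = defaultdict(list)  # user id -> list of past query sets

    def check_and_log(self, user, record_ids):
        """Return True and log the query if allowed; return False if refused."""
        new_set = frozenset(record_ids)
        for old_set in self.log[user]:
            # Records present in exactly one of the two query sets.
            difference = new_set ^ old_set
            if 0 < len(difference) < self.k:
                return False  # answers could be differenced to expose individuals
        self.log[user].append(new_set)
        return True

auditor = QueryAuditor(k=2)
first = auditor.check_and_log("alice", {1, 2, 3, 4})  # allowed
second = auditor.check_and_log("alice", {1, 2, 3})    # refused: isolates record 4
third = auditor.check_and_log("bob", {1, 2, 3})       # allowed: separate user log
```

The log grows with every answered query and each new query is compared against it, which is consistent with the high processing cost attributed to auditing in the comparison table.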

21 Random Sampling
- Only a sample of the records meeting the requirements of the query is shown
- Must maintain consistency by giving exactly the same results for the same query
- Weakness: logically equivalent queries can result in different query sets
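A minimal sketch of consistent random sampling, assuming the sample is chosen by hashing the query text together with each record ID. This keying scheme and the function names are assumptions for illustration, not a method given in the slides.

```python
import hashlib

def in_sample(query_text, record_id, rate):
    """Decide deterministically whether a record is sampled for this query."""
    digest = hashlib.sha256(f"{query_text}|{record_id}".encode()).digest()
    # Map the first 8 bytes of the hash to a fraction in [0, 1).
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return fraction < rate

def sampled_query_set(query_text, matching_ids, rate=0.5):
    """Return the subset of matching records that the user is allowed to see."""
    return [rid for rid in matching_ids if in_sample(query_text, rid, rate)]

ids = list(range(1, 21))
first_run = sampled_query_set("age > 40 AND sex = 'F'", ids)
second_run = sampled_query_set("age > 40 AND sex = 'F'", ids)   # same query text
reordered = sampled_query_set("sex = 'F' AND age > 40", ids)    # equivalent query
```

Repeating the identical query always yields the same sample, which satisfies the consistency requirement. Because this sketch keys the sample on the query text, a logically equivalent query written differently is sampled independently and will generally return a different set, which illustrates the weakness noted on the slide.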

22 Comparison Methods
The following criteria are used to determine the most effective methods of statistical database security:
- Security: possibility of exact disclosure, partial disclosure, robustness
- Richness of Information: amount of non-confidential information eliminated, bias, precision, consistency
- Costs: initial implementation cost, processing overhead per query, user education

23 A Comparison of Methods

Method                  Security       Richness of Information   Costs
Query-set Restriction   Low            Low (1)                   Low
Microaggregation        Moderate       Moderate                  Moderate
Data Perturbation       High           High-Moderate             Low
Output Perturbation     Moderate       Moderate-Low              Low
Auditing                Moderate-Low   Moderate                  High
Sampling                Moderate       Moderate-Low              Moderate

(1) Quality is low because a lot of information can be eliminated if the query does not meet the requirements

24 Sources
- This presentation is posted on http://www.cs.jmu.edu/users/aboutams
- Adam, Nabil R.; Wortmann, John C.; Security-Control Methods for Statistical Databases: A Comparative Study; ACM Computing Surveys, Vol. 21, No. 4, December 1989 (http://delivery.acm.org/10.1145/80000/76895/p515-adam.pdf?key1=76895&key2=1947043301&coll=portal&dl=ACM&CFID=4702747&CFTOKEN=83773110)
- Official HIPAA (http://cms.hhs.gov/hipaa/)
- Bernstein, Stephen W.; Impact of HIPAA on BioTech/Pharma Research: Rules of the Road (http://www.privacyassociation.org/docs/3-02bernstein.pdf)
- Service Bureau; 3rd Party Testing (http://hipaatesting.com/service_bureau.html)

