Sensitive Data In a Wired World
Negative Representations of Data
Stephanie Forrest
Dept. of Computer Science, Univ. of New Mexico, Albuquerque, NM



Introduction
Goal: Develop new approaches to data security and privacy that incorporate design principles from living systems:
–Survivability and evolvability
–Autonomy
–Robustness, adaptation and self repair
–Diversity
Extends earlier work on computational properties of the immune system:
–Intrusion detection
–Automated response
–Collaborative information filtering

Project Overview
Immunology and data:
–Negative representations of information
Epidemiology and the Internet:
–Social networks matter
–The real world is not always scale free
The social utility of privacy:
–Why is privacy an important value in democratic societies?
–Evolutionary perspective

Collaborations
Paul Helman and Cris Moore (UNM)
Robert Axelrod and Mark Newman (Univ. Michigan)
Matthew Williamson (Sana Security)
Rebecca Wright and Michael de Mare (Stevens)
Joan Feigenbaum and Avi Silberschatz (Yale)
–Fernando Esponda’s post-doc next year.

How the Immune System Distributes Detection
Advantages of distributed negative detection:
–Localized (no communication costs)
–Scalable and tunable
–Robust (no single point of failure)
–Private
Many small detectors matching nonself (negative detection).
Each detector matches multiple patterns (generalization).

Applications to Computing
Anomaly detectors: earlier work
Information filters: earlier work
Adaptive queries: future
Negative representations: in progress
–A positive set DB is a set of fixed-length strings.
–A negative set NDB represents all the strings not in DB.
–Intuition: If an adversary obtains a string from NDB, little information is revealed.
Example:
–U = all possible four-character strings
–DB = {juan, eric, dave}
–U-DB = {aaaa, aaab, cris, john, luca, raul, tehj, tosh, …}
–There are |U| - 3 strings in U-DB (26^4 - 3 = 456,973 if the alphabet is the 26 lowercase letters).
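
The example above can be checked with a few lines of Python. This is only a sketch of the definition: it assumes the alphabet is the 26 lowercase ASCII letters (the slide does not state the alphabet), and the helper name `in_negative_set` is made up for illustration.

```python
from string import ascii_lowercase

L = 4                              # string length in the slide's example
DB = {"juan", "eric", "dave"}      # positive set from the slide

def in_negative_set(s: str) -> bool:
    """True if s is a valid string of U that DB does not contain (s is in U-DB)."""
    return len(s) == L and set(s) <= set(ascii_lowercase) and s not in DB

# Sizes under the assumed 26-letter alphabet.
universe_size = len(ascii_lowercase) ** L     # |U| = 26^4
negative_size = universe_size - len(DB)       # |U-DB|

print(universe_size, negative_size)
print(in_negative_set("cris"), in_negative_set("juan"))
```

U-DB is never listed explicitly here. Even in this tiny example it is five orders of magnitude larger than DB, which is why a compact representation is needed.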

Results
Can U-DB be represented efficiently, given |U-DB| >> |DB|?
–YES: There is an algorithm that creates an NDB of size polynomial in the size of DB.
–Strategy: Compress information using a don’t-care symbol (*).
Other representations?
What properties does the representation have?
–Membership queries are tractable (linear time even without indexing). Other queries and information leakage are future work.
–Inferring information from a subset of NDB (next slide).
–Inferring DB from NDB is NP-Hard (note: not doing crypto):
Currently investigating instance difficulty.
Algorithms for increasing instance difficulty.
On-line insert/delete algorithms preserve problem difficulty.
Collaborations with R. Wright, M. de Mare, and C. Moore.
[Table: columns DB, U-DB, NDB; entries lost in extraction apart from the don’t-care symbol *]
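
One way to see how a don't-care symbol keeps NDB small is a prefix-based construction: emit one entry for each shortest prefix that no string in DB starts with, and pad it with `*`. The sketch below assumes binary strings and is a simple illustrative construction, not necessarily the algorithm referred to on the slide. The function names are hypothetical.

```python
def matches(entry: str, s: str) -> bool:
    """An NDB entry matches s when every non-'*' position agrees with s."""
    return len(entry) == len(s) and all(e in ("*", c) for e, c in zip(entry, s))

def build_ndb(db: set[str], length: int, alphabet: str = "01") -> list[str]:
    """Emit one padded entry per shortest prefix that does not occur in db."""
    ndb = []
    prefixes = {""}                          # prefixes of length i that occur in db
    for i in range(length):
        seen = {s[: i + 1] for s in db}      # prefixes of length i+1 that occur in db
        for p in sorted(prefixes):
            for c in alphabet:
                if p + c not in seen:
                    ndb.append(p + c + "*" * (length - i - 1))
        prefixes = seen
    return ndb

def in_db(ndb: list[str], s: str) -> bool:
    """Membership query: s is in DB exactly when no NDB entry matches it."""
    return not any(matches(e, s) for e in ndb)

db = {"000", "101"}
ndb = build_ndb(db, 3)
print(ndb)
print(in_db(ndb, "101"), in_db(ndb, "111"))
```

For a non-empty DB this construction emits at most |DB| × L × |alphabet| entries, so NDB stays polynomial in the size of DB even though U-DB grows exponentially with L. A membership query is a single linear scan over NDB. Note that an NDB built this way is easy to reverse, so it says nothing about the hardness results on the slide.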

What information is revealed by queries? (without assuming irreversibility)
Having access to a subset of NDB (or DB) yields some information about strings outside that subset:
–Assume NDB (or DB) is partitioned into n subsets. To the query “Is x in DB,” what do I learn about x if x is not in my subset?
–Must consult n subsets of NDB to conclude that x is in DB.
–Must consult the subsets only until x is found (on average n/2).
–Assumes that we care more about DB than U-DB.
Probability and information content as the membership of strings is revealed. DB contains 10% of all possible L-length strings (formulas).
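
The "on average n/2" figure can be made concrete with a short calculation. The sketch assumes that x is equally likely to sit in any of the n subsets and that subsets are consulted one at a time in a fixed order. Both assumptions are mine, not the slide's. Under them the exact average is (n + 1)/2, which the slide rounds to n/2.

```python
from fractions import Fraction

def expected_lookups_when_present(n: int) -> Fraction:
    """Average number of subsets consulted when x lies in one of n subsets,
    each equally likely: (1 + 2 + ... + n) / n = (n + 1) / 2."""
    return Fraction(sum(range(1, n + 1)), n)

def lookups_when_absent(n: int) -> int:
    """All n subsets must be consulted before concluding x is in none of them."""
    return n

print(expected_lookups_when_present(10), lookups_when_absent(10))
```

With NDB partitioned, the expensive case (all n subsets) is exactly the case where x is in DB, which matches the slide's remark that the scheme favors protecting DB over U-DB.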

Private Set Intersection
Determine which records are in the intersection of several databases, i.e.
–DB_1 ∩ DB_2 ∩ … ∩ DB_n
–¬(NDB_1 ∪ NDB_2 ∪ … ∪ NDB_n)
Each party may compute the intersection:
–DB_i - (NDB_1 ∪ NDB_2 ∪ … ∪ NDB_n)
Party i learns only the intersection of all the sets, and not the cardinality of the other sets.
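
The set algebra behind this slide can be sketched as follows: a party keeps the records of its own DB that no entry of any other party's NDB matches. The data are hypothetical 3-bit strings with hand-built NDBs. The sketch shows only the computation, not the privacy property claimed on the slide.

```python
def matches(entry: str, s: str) -> bool:
    """An NDB entry matches s when every non-'*' position agrees with s."""
    return len(entry) == len(s) and all(e in ("*", c) for e, c in zip(entry, s))

def intersect(own_db: set[str], ndbs: list[list[str]]) -> set[str]:
    """Keep the records of own_db that are matched by no entry of any NDB."""
    return {s for s in own_db
            if not any(matches(e, s) for ndb in ndbs for e in ndb)}

# Hypothetical example over 3-bit strings.
db1 = {"000", "101", "110"}               # party 1's own (positive) database
ndb2 = ["01*", "11*", "001", "100"]       # represents U - {"000", "101"}
ndb3 = ["0**", "1*0"]                     # represents U - {"101", "111"}

print(intersect(db1, [ndb2, ndb3]))
```

Party 1 never sees DB_2 or DB_3 directly. It only tests its own records against the negative entries it receives.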

Results cont.
How might these properties be useful?
–Protect data from insider attacks
–Computing set intersections
–Surveys involving sensitive information
–Anonymous digital credentials
–Fingerprint databases
–Other ideas?
Prototype implementations:
–Perl, C
–See demo

Computer Epidemiology
Justin Balthrop, Mark Newman, Matt Williamson
Information spreads over networks of social contacts between computers:
–Email address books.
–URL links.
Network topology affects the rate and extent of spreading:
–Epidemiological models, and the epidemic threshold.
Controlling spread on scale-free networks:
–Random vaccination is ineffective (e.g., anti-virus software).
–Targeted vaccination of high-connectivity nodes.
–Control degree distribution in time rather than space.
Science 304 (2004)
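
A toy calculation illustrates why vaccinating high-connectivity nodes matters. The sketch uses a deliberately extreme hub-dominated network (one hub, nine low-degree nodes), not a real scale-free graph. It measures the largest connected group of unvaccinated nodes as a rough proxy for how far an infection could spread. The graph and function names are illustrative assumptions.

```python
from collections import deque

def largest_component(adj: dict[int, set[int]], vaccinated: set[int]) -> int:
    """Size of the largest connected group of unvaccinated nodes (BFS)."""
    seen, best = set(vaccinated), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best

# Toy network: node 0 is linked to nodes 1..9, plus one extra edge 1-2.
adj = {i: set() for i in range(10)}
for leaf in range(1, 10):
    adj[0].add(leaf)
    adj[leaf].add(0)
adj[1].add(2)
adj[2].add(1)

hub = max(adj, key=lambda n: len(adj[n]))       # highest-connectivity node
after_targeted = largest_component(adj, {hub})  # vaccinate the hub
after_leaf = largest_component(adj, {9})        # vaccinate a low-degree node instead

print(hub, after_targeted, after_leaf)
```

A uniformly random pick lands on a low-degree node nine times out of ten in this graph, which leaves the network almost fully connected. Removing the hub fragments it.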

The Social Utility of Privacy
Robert Axelrod and Ryan Gerety
Typical framing:
–Privacy values should remain as is (e.g., Lessig).
–Individual rights vs. state (i.e., civil liberties vs. community safety / crime).
A community may have its own interest in defending individual privacy (and not), independent of the civil liberties argument:
–To promote innovation in changing environments.
–To cope with distortions (e.g., overconfidence of middle managers).
–To compensate for overgeneralized norms.
Not necessarily advocating more privacy:
–From a societal/informational point of view, how should appropriate bounds on privacy be determined?
Current status:
–Exploratory modeling based on simple games.

Next Steps: Negative Representations
Distributed negative representations
Leaking partial information
Relational algebra operators on the negative database:
–Select, join, etc.
Instance difficulty:
–Hiding given satisfying assignments in a SAT formula
–Approximate representations
–Other representations?
More realistic implementations
Negative data mining:
–Is it easier/harder to find certain instances in NDB?
Imprecise representations:
–Partial matching and queries
–Learning algorithms

People
Stephanie Forrest
Elena Ackley
Fernando Esponda
Paul Helman

Publications
F. Esponda, S. Forrest, and P. Helman. “Negative representations of information.” International Journal of Information Security (submitted March 2005).
F. Esponda, E. S. Ackley, S. Forrest, and P. Helman. “On-line negative databases.” Journal of Unconventional Computing (in press).
F. Esponda, S. Forrest, and P. Helman. “A formal framework for positive and negative detection.” IEEE Transactions on Systems, Man, and Cybernetics 34:1 (2004).
J. Balthrop, S. Forrest, M. Newman, and M. Williamson. “Technological networks and the spread of computer viruses.” Science 304 (2004).
H. Inoue and S. Forrest. “Inferring Java security policies through dynamic sandboxing.” 2005 International Conference on Programming Languages and Compilers (PLC'05) (in press).
F. Esponda, E. Ackley, S. Forrest, and P. Helman. “On-line negative databases.” Third International Conference on Artificial Immune Systems (ICARIS). Best paper award (2004).

SUPPLEMENTARY MATERIAL

Probabilities

Generating Hard-to-Reverse Negative Databases
The randomized algorithm can be used to create a negative database.
Insert/Delete operations turn known hard formulas into negative databases.
The Morph operator may be used to search for hard instances.
H. Jia, C. Moore and B. Selman. “From spin glasses to hard satisfiable formulas.” SAT 2004.
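
The link between formulas and negative databases can be illustrated with a direct correspondence: each CNF clause forbids exactly the assignments that falsify it, and that set is one NDB entry with `*` on the variables the clause does not mention. DB is then the set of satisfying assignments. This sketch shows that correspondence only. It is not the insert/delete procedure mentioned on the slide. The formula is hypothetical and clauses use DIMACS-style signed integers.

```python
from itertools import product

def clause_to_entry(clause: list[int], n_vars: int) -> str:
    """Map a CNF clause to the NDB entry matching the assignments that falsify it."""
    entry = ["*"] * n_vars
    for lit in clause:
        # A positive literal is falsified by 0, a negated literal by 1.
        entry[abs(lit) - 1] = "0" if lit > 0 else "1"
    return "".join(entry)

def matches(entry: str, s: str) -> bool:
    """An NDB entry matches s when every non-'*' position agrees with s."""
    return len(entry) == len(s) and all(e in ("*", c) for e, c in zip(entry, s))

# Hypothetical formula: (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
formula = [[1, 2], [-1, 3], [-2, -3]]
ndb = [clause_to_entry(c, 3) for c in formula]

# Brute-force recovery of DB; feasible only because the example is tiny.
db = set()
for bits in product("01", repeat=3):
    s = "".join(bits)
    if not any(matches(e, s) for e in ndb):
        db.add(s)

print(ndb, sorted(db))
```

Recovering DB from such an NDB amounts to finding satisfying assignments of the formula, which is why hard satisfiable formulas are a natural source of hard-to-reverse negative databases.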

Effect of the Morph operation
The Morph operation takes as input a negative database NDB and outputs NDB’ that represents the same set U-DB. The plot shows how the complexity of a database changes after applying the Morph operator.
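
The slides do not define the Morph operator, so the sketch below is not that operator. It shows only the property the slide states, using one simple transformation of my own choosing: replacing a `*` in an entry by every alphabet symbol changes NDB syntactically while leaving the represented set U-DB untouched. Binary strings are assumed.

```python
from itertools import product

def matches(entry: str, s: str) -> bool:
    """An NDB entry matches s when every non-'*' position agrees with s."""
    return len(entry) == len(s) and all(e in ("*", c) for e, c in zip(entry, s))

def expand(entry: str, pos: int, alphabet: str = "01") -> list[str]:
    """Replace a '*' at pos by each symbol; together the results match the same strings."""
    if entry[pos] != "*":
        return [entry]
    return [entry[:pos] + c + entry[pos + 1:] for c in alphabet]

def represented(ndb: list[str], length: int, alphabet: str = "01") -> set[str]:
    """All strings matched by at least one entry (brute force, small cases only)."""
    return {"".join(b) for b in product(alphabet, repeat=length)
            if any(matches(e, "".join(b)) for e in ndb)}

ndb = ["01*", "11*", "001", "100"]
ndb_prime = expand(ndb[0], 2) + ndb[1:]

print(ndb_prime)
print(represented(ndb, 3) == represented(ndb_prime, 3))
```

Many syntactically different NDBs represent the same U-DB, which is what makes it possible to search among them for instances that are harder to reverse.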