Semantics for Privacy and Context Tim Finin University of Maryland, Baltimore County Joint work with Anupam Joshi, Prajit Das, Primal Pappachan, Eduardo Mena and Roberto Yus


1 Semantics for Privacy and Context Tim Finin University of Maryland, Baltimore County Joint work with Anupam Joshi, Prajit Das, Primal Pappachan, Eduardo Mena and Roberto Yus http://ebiq.org/r/363

2 The plot outline Today’s focus on big data requires semantics → Variety → Need for integration & fusion → Must understand data semantics → Use semantic languages & tools (reasoners, ML) → Have shared ontologies & background knowledge Relevance to privacy and security – Protect personal information, esp. in mobile/IoT – Understanding and using context is often useful if not critical – Security is relevant, as intrusions lead to loss of privacy

3 Use Case Examples We’ve used semantic technologies in support of assured information tasks including – Representing & enforcing information sharing policies – Negotiating for cloud services respecting organizational constraints (e.g., data privacy, location, …) – Modeling context for mobile users and using this to manage information sharing – Acquiring, using and sharing knowledge for situationally-aware intrusion detection systems Key technologies include Semantic Web languages (OWL, RDF) and tools and information extraction from text

4 Context-Aware Privacy & Security Smart mobile devices know a great deal about their users, including their current context Sensor data, email, calendar, social media, … Acquiring & using this knowledge helps them provide better services Context-aware policies can be used to limit information sharing as well as to control the actions and information access of mobile apps Sharing context with other users, organizations and service providers can also be beneficial Context is more than time and GPS coordinates: “We’re in a two-hour budget meeting at X with A, B and C”; “We’re in an important meeting”; “We’re busy” http://ebiq.org/p/589

5 Simple Context Ontology Light-weight, upper level context OWL ontology Centered around the concepts for: users, conceptual places, geo-places, activities, roles, space, and time Conceptual places such as at work and at home Activities occur at places & involve users filling roles LOD resources provide background knowledge

6 Context / situation recognition Train classifiers: decision trees, Naïve Bayes, SVM Feature vector: time, noise level in dB (avg, min, max), accel 3 axis (avg, min, max, magnitude), wifis, … Train HMM models
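As a minimal sketch of how the feature vector on this slide might be assembled before it is passed to a classifier: the function names, the window format and the ordering of features are assumptions made for illustration, not the authors' implementation, and no classifier training is shown.

```python
import math
from statistics import mean

def summarize(values):
    """Return [avg, min, max] for one window of sensor readings."""
    return [mean(values), min(values), max(values)]

def feature_vector(hour, noise_db, accel_xyz, wifi_count):
    """Build one flat feature vector (hypothetical layout).

    hour       -- hour of day, standing in for the 'time' feature
    noise_db   -- list of noise-level samples in dB
    accel_xyz  -- list of (x, y, z) accelerometer samples
    wifi_count -- number of visible Wi-Fi access points
    """
    features = [hour]
    features += summarize(noise_db)
    # avg/min/max per accelerometer axis
    for axis in range(3):
        features += summarize([sample[axis] for sample in accel_xyz])
    # mean magnitude of the acceleration vector over the window
    magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in accel_xyz]
    features.append(mean(magnitudes))
    features.append(wifi_count)
    return features
```

A vector like this could then be fed to any of the classifier families named on the slide.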

7 Context-aware Privacy Policies We use declarative policies that can access the user’s profile and context model for privacy and security One use is to control what information we share with whom and in what context Another is to control the actions that an app can take (e.g., enable camera, access SD card) depending on the context A third is to obfuscate some shared information (e.g., location)

8 Context-aware Policies for Sharing Android's policies are limited Privacy controls in existing applications are limited – Friends Only and Invisible restrictions common – Not context-dependent but static and predetermined Controls to share other data largely non-existent

9 Context-aware Policies for Sharing Android's policies are very limited Privacy controls in existing location sharing applications are limited – Friends Only and Invisible restrictions common – Not context-dependent but static and predetermined Controls to share other data largely non-existent Static Information Aspects of Context Generalization of Context Temporal Restrictions Context Restrictions Requester’s Context

10 Location Generalization GeoNames spatial containment knowledge from the LOD cloud is used when populating the KB – Share my location with manager on weekdays from 9am-5pm User’s exact location in terms of GPS co-ordinates is shared The user may prohibit sharing GPS co-ordinates but permit sharing city-level location – Share my building-wide location with co-workers not in my team on weekdays from 9am-5pm – Do not share location on weekends

11 Location Generalization GeoNames spatial containment knowledge from the LOD cloud is used when populating the KB – Share my location with teachers on weekdays from 9am-5pm User’s exact location in terms of GPS co-ordinates is shared The user may prohibit sharing GPS co-ordinates but permit sharing city-level location – Share my building-wide location with teachers on weekdays from 9am-5pm
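The location-sharing policies on these slides can be sketched as data plus a small evaluator. This is only an illustration under stated assumptions: the dictionary layout, the role names, the location values and the function name are all hypothetical, and the actual system expresses such policies declaratively over an OWL context model rather than in Python.

```python
from datetime import datetime

# Hypothetical location of the user at three levels of containment,
# most precise first (all values are made up for illustration).
LOCATION = {
    "gps": "39.25,-76.71",
    "building": "Building A",
    "city": "Baltimore",
}

# Each policy: who may ask, on which weekdays (0 = Monday), during which
# hours, and at what level of generalization the location is released.
POLICIES = [
    {"role": "manager", "days": range(0, 5), "hours": (9, 17), "level": "gps"},
    {"role": "coworker", "days": range(0, 5), "hours": (9, 17), "level": "building"},
]

def shared_location(role, when):
    """Return the location string released to `role` at time `when`,
    or None when no policy permits sharing (default deny)."""
    for policy in POLICIES:
        start, end = policy["hours"]
        if (policy["role"] == role
                and when.weekday() in policy["days"]
                and start <= when.hour < end):
            return LOCATION[policy["level"]]
    return None
```

With these assumed policies, a manager asking on a weekday morning gets GPS coordinates, a co-worker gets only the building, and nobody gets anything on weekends.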

12 Activity Generalization – Share my activity with friends on weekends User’s current activity shared with friends on weekends Share a more generalized activity rather than the precise one: confidential project meeting => Office Meeting => Working => Busy, Date => Meeting Friends – User clearly needs to obfuscate certain pieces of activity information to protect her context info – Share my public activity with friends on weekends Public is a visibility option

13 Activity Generalization – Share my activity with friends on weekends User’s current activity shared with friends on weekends Share a more generalized activity rather than the precise one: confidential project meeting => Working, Date => Meeting – User clearly needs to obfuscate certain pieces of activity information to protect her context info – Share my public activity with friends on weekends Public is a visibility option
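Activity generalization can be pictured as walking up a hierarchy of activity types. The sketch below hard-codes the chain shown on slide 12 in a plain dictionary; the `PARENT` table and the `generalize` function are assumptions for illustration, since the real system would take this hierarchy from its ontology.

```python
# Child -> parent links taken from the example chain on slide 12.
PARENT = {
    "Confidential project meeting": "Office meeting",
    "Office meeting": "Working",
    "Working": "Busy",
    "Date": "Meeting friends",
}

def generalize(activity, steps):
    """Replace `activity` with an ancestor `steps` levels up,
    stopping early at the top of the hierarchy."""
    for _ in range(steps):
        if activity not in PARENT:
            break
        activity = PARENT[activity]
    return activity
```

For example, `generalize("Confidential project meeting", 2)` yields `"Working"`, which is the kind of obfuscated value a policy could share with friends.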

14 Context-aware power management Maintaining the context model uses power We empirically determine power usage for a phone’s sensors and use this for optimization

15 Context-aware power management Maintaining the context model uses power We developed accurate power models for a phone’s sensors and use them for optimization When updating the context model: 1. Only enable sensors required by policy, and reuse recent sensor readings whenever appropriate, e.g., disable the GPS sensor when at home in the evening 2. Prefer sensors with a lower energy footprint or already in use when several are available, e.g., choose Wi-Fi over GPS for location at the office during the day 3. Reorder rule conditions to reduce energy use, e.g., check conditions requiring no sensor access first http://ebiq.org/p/632
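The third optimization, reordering rule conditions, can be sketched as sorting conditions by the energy cost of the sensor they need and stopping at the first one that fails. The cost table below is hypothetical (the slides mention measured power models but give no numbers), and the function and parameter names are assumptions.

```python
# Hypothetical relative energy cost per sensor; None means the
# condition needs no sensor at all (e.g., a time-of-day check).
SENSOR_COST = {None: 0, "wifi": 1, "gps": 10}

def evaluate(conditions, read_sensor):
    """Evaluate a conjunction of conditions, cheapest sensor first.

    conditions  -- list of (sensor_name_or_None, predicate) pairs
    read_sensor -- callable mapping a sensor name to its current reading
    Returns (result, sensors_read); evaluation stops at the first
    failing condition so more expensive sensors are never powered on.
    """
    used = []
    ordered = sorted(conditions, key=lambda cond: SENSOR_COST[cond[0]])
    for sensor, predicate in ordered:
        value = None
        if sensor is not None:
            value = read_sensor(sensor)
            used.append(sensor)
        if not predicate(value):
            return False, used
    return True, used
```

If a sensor-free condition already fails, the GPS is never read, which is the saving the slide describes.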

16 Collaborative Context Sharing Like Blanche DuBois, we have always depended on the kindness of strangers We are cooperative & ask one another for info. – Stranger on the street: Does this bus go to the aquarium? – Random classmate in next seat: When is HW6 due? Devices can use ad hoc networks (e.g., Bluetooth) to query nearby devices for desired information Each device uses a policy for what triples it’s willing to share with whom in what context → Mobile Ad Hoc Knowledge Network

17 Collaboratively Constructed Contexts A co-located group of devices can collaborate to share some context information – Exploit their different sensors and context detection/modeling capabilities – Consensus modeling can improve accuracy and overcome errors & malicious misinformation Policies and context determine what to share with whom and in what context We’ve designed an approach to detect/create groups and share information and used an Android prototype for simple evaluations
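One simple way to picture consensus modeling is a majority vote over what co-located devices report for the same context attribute. The sketch below is only that: a plain vote with a hypothetical `consensus` function and made-up device names, not the synthesizer design used in the prototype.

```python
from collections import Counter

def consensus(reports):
    """Return the value reported by most devices, or None when the
    vote is empty or tied.

    reports -- dict mapping a device name to the value it reports
               for one context attribute (e.g., current place)
    """
    counts = Counter(reports.values()).most_common()
    if not counts:
        return None
    # A tie between the two most common values means no consensus.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]
```

A single device with a wrong reading (or a malicious one) is outvoted as long as most of the group agrees.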

18 Collaborative Context Use Case Four GCC students with five devices in GCC library. All want to know where they are and what they’re doing

19 Collaborative Context Use Case Abed, Annie & Jeff are in a study group. Jeff has a phone and tablet. Pierce just happens to be there.

20 Collaborative Context Use Case Jeff’s phone knows it is in room 7 and that he’s talking; Annie’s tablet thinks she’s at home.

21 Context Sharing With help from context synthesizers, participants can have an appropriate consensus model Study group (Abed, Annie, Jeff): “study group about Spanish, duration of one hour, participants: Jeff, Abed, Annie” In room (all): “in study room 7, in Greendale Community College, temp: 25 °C, lights on” Jeff's devices: + "heart_rate:70bpm"

22 Context Ontology Assume devices use a shared ontology for context Prototype uses JFact for DL reasoning on Android devices

23 Architecture Context providers have information to share Context synthesizers integrate, de-conflict & enrich data Prototype uses secure communication over Bluetooth

24 Context Groups Context synthesizer recognizes groups and creates default groups Predefined (e.g., ACM student chapter) Default groups created for identity, location and activity Provider’s own policies control what is shared with a group

25 Context integration and reconciliation comments

26 Faceblock http://ebiq.org/p/666 Click image to play 80 second video or go to YouTube

27 Conclusion Google’s new slogan: things, not strings We can construct context models in semantic languages using data from sensors, calendars and other sources Semantic policies for information sharing can manage what is shared with whom and in what context Additional protocols and infrastructure will permit dynamic collaborative context models http://ebiq.org/r/363

28 Intrusion Detection Systems Current intrusion detection systems are poor at handling zero-day and “low and slow” attacks, and APTs Sharing information from heterogeneous data sources can provide useful information even when an attack signature is unavailable Implemented prototypes that integrate and reason over data from IDSs, host and network scanners, and text at the knowledge level We’ve established the feasibility of the approach in simple evaluation experiments

29 From dashboards & watchstanding (Simple) Analysis

30 Threat/Vulnerability Alert Knowledge Base Reasoner Ontology Domain Expert Knowledge RDFS Knowledge Web Text Sources (Blogs, Forums, Feeds) Entity/Concept Extractor Named Entities Security Vulnerability Entities Extractor Security Vulnerability Terms IDS/IPS sensors Reports and Logs Host Based Activity Monitor Host Activity Logs Network Activity Monitor Network Activity Logs Hardware Security Sensors Security Logs System Architecture 2 http://ebiq.org/p/604

31 … to situational awareness Non Traditional “Sensors” Traditional Sensors Facts / Information Context/Situation Rules Policies Analytics Alerts Use-after-free vulnerability in Microsoft Internet Explorer 6 through 8 …. [ a IDPS:text_entity; IDPS:has_vulnerability_term "true"; IDPS:has_security_exploit "true"; IDPS:has_text "Internet Explorer"; IDPS:has_text "arbitrary code"; IDPS:has_text "remote attackers". ] [ a IDPS:system; IDPS:host_IP "130.85.93.105". ] [ a IDPS:scannerLog; IDPS:scannerLogIP "130.85.93.105"; … ] [ a IDPS:gatewayLog; IDPS:gatewayLogIP "130.85.93.105"; … ] [ IDPS:scannerLog IDPS:hasBrowser ?Browser IDPS:gatewayLog IDPS:hasURL ?URL ?URL IDPS:hasSymantecRating "unsafe" IDPS:scannerLog IDPS:hasOutboundConnection "true" IDPS:WiresharkLog IDPS:isConnectedTo ?IPAddress ?IPAddress IDPS:isZombieAddress "true" ] => [ IDPS:system IDPS:isUnderAttack "use-after-free vulnerability" IDPS:attack IDPS:hasMeans "Backdoor" IDPS:attack IDPS:hasConsequence "UnauthorizedRemoteAccess" ] http://ebiq.org/p/604
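To make the logic of the rule on this slide easier to follow, here is a rough Python rendering of its condition over a set of (subject, predicate, object) triples. This is an illustrative sketch, not how the prototype works: the actual system uses a reasoner over RDF data, the `IDPS:` prefixes are dropped for brevity, and the URL and the zombie IP address in the example facts are made up.

```python
def under_attack(triples):
    """Return True when the facts satisfy every condition of the rule:
    a browser was seen by the scanner, the gateway logged a URL rated
    unsafe, the host has an outbound connection, and it is connected
    to an address known to be a zombie."""
    facts = set(triples)
    urls = {o for s, p, o in facts if (s, p) == ("gatewayLog", "hasURL")}
    ips = {o for s, p, o in facts if (s, p) == ("WiresharkLog", "isConnectedTo")}
    return (
        any(s == "scannerLog" and p == "hasBrowser" for s, p, o in facts)
        and any((url, "hasSymantecRating", "unsafe") in facts for url in urls)
        and ("scannerLog", "hasOutboundConnection", "true") in facts
        and any((ip, "isZombieAddress", "true") in facts for ip in ips)
    )

# Hypothetical facts; the URL and the remote IP are placeholders.
EXAMPLE_FACTS = [
    ("scannerLog", "hasBrowser", "Internet Explorer"),
    ("gatewayLog", "hasURL", "http://example.test/page"),
    ("http://example.test/page", "hasSymantecRating", "unsafe"),
    ("scannerLog", "hasOutboundConnection", "true"),
    ("WiresharkLog", "isConnectedTo", "203.0.113.7"),
    ("203.0.113.7", "isZombieAddress", "true"),
]
```

When `under_attack(EXAMPLE_FACTS)` holds, the rule's conclusion would assert that the system is under attack, with the means and consequence given on the slide.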

32 Maintaining the vulnerability KB Our approach requires us to keep the KB of software products and known or suspected vulnerabilities and attacks up to date Resources like NVD are great, but tapping into text can enrich their info and give earlier warnings of problems CVE disclosed (01/14/13) Vendor deploys software Attacker finds vuln. & exploits it (01/10/13) Exploit reported in mailing list (01/10/13) Vuln. reported in NVD RSS feed Analysis Vuln. Analyzed & included in NVD feed (02/16/2013) Vendor Analysis Threat disclosed in vendor bulletin (03/04/2013) Patch development Patch released (Critical Patch Update) (06/18/2013) Resolution System update

33 Information extraction from text CVE-2012-0150 Buffer overflow in msvcrt.dll in Microsoft Windows Vista SP2, Windows Server 2008 SP2, R2, and R2 SP1, and Windows 7 Gold and SP1 allows remote attackers to execute arbitrary code via a crafted media file, aka “Msvcrt.dll Buffer Overflow Vulnerability.” ebqids:hasMeans Identify relationships http://dbpedia.org/resource/Buffer_overflow Link concepts to entities http://dbpedia.org/resource/Windows_7 ebqids:affectsProduct http://dbpedia.org/resource/Arbitrary_code_execution We use information extraction techniques to identify entities, relations and concepts in security related text These are mapped to terms in our ontology and the DBpedia LOD KB (based on Wikipedia) Google’s slogan: “Things, not strings”

34 Security Bulletins Blogs Maintaining the vulnerability KB Unstructured Data (Vuln. Summaries) Entity & Concept Spotter Extracted Concepts Web Text Triple Store NVD dataset Structured Data (XML) IDS Ontology Linked Cybersecurity Data Consumers Linking & Mapping Entities RDF Generation http://ebiq.org/p/629

35 Populating KBs from Text Kelvin is a system for populating KBs with entities and relations extracted from text – Developed at JHU Human Language Technology Center of Excellence – E.g., extracts 300K entities and 3M relations from 50K newswire articles Supports analytics at KB level: inference, probabilistic reasoning, entity linking across KBs, … Top system in 2012 & 2013 NIST Text Analysis Conference Cold Start KBP task evaluations http://ebiq.org/p/671

36 Faceblock Ontology Faceblock’s (OWL) ontology lets one write context policy rules using predefined activity and place types

37 Faceblock Ontology Faceblock’s (OWL) ontology lets one write context policy rules using predefined activity and place types

38 Faceblock Protocols User device maintains context, reasons with policy rules and informs glass devices of Faceblock property: True or False

39 Taming Wild Big Data WBD is structured or semi-structured data for which we lack schema-level understanding – e.g., raw tables, graphs, XML, logs Developed tools to generate semantic data from background ontologies & KBs, e.g. for clinical trial tables It’s harder when the domain is not even known. We’re developing systems that use large background KBs (e.g., Google’s Freebase) to predict types/subtypes of data instances http://ebiq.org/p/672 http://ebiq.org/p/661

