Data and Applications Security Developments and Directions Dr. Bhavani Thuraisingham The University of Texas at Dallas Guest Lecture Lecture #27 Cyber Crime, Solutions, Privacy and the Semantic Web April 19, 2005
Outline l Cyber Crime l Some Solutions l Privacy l Secure Semantic Web
Types of Cyber Crime Security Threats and Violations Access Control Violations Integrity/ Privacy Violations Fraud/ Identity Theft Denial of Service/ Infrastructure Attacks Sabotage Confidentiality Authentication Nonrepudiation Violations
Some Solutions l Access Control Models l Digital Identity Management l Identity Theft Management l Digital Forensics l Digital Watermarking l Risk Analysis l Encryption l Biometrics
Types of Access Control l Inference Problem and Access Control - The inference problem occurs when users pose queries and deduce unauthorized information from the legitimate responses - Security constraint processing for controlling inferences l Temporal Access Control Models - Incorporate a time parameter into the access control models l Role-based Access Control - Controlling access based on the roles of people and the activities they carry out; implemented in commercial systems l Positive and Negative Authorizations - Should negative authorizations be explicitly specified? How can conflicts be resolved? l Usage Control - Policies of Authorizations, Obligations and Conditions
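As a rough illustration of role-based access control with positive and negative authorizations, the sketch below resolves conflicts with a deny-overrides rule. That rule is only one possible answer to the conflict question raised on the slide, and the `Policy` class, role names, and resources are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical RBAC policy; each authorization is an (action, resource) pair."""
    grants: dict = field(default_factory=dict)      # role -> positive authorizations
    denials: dict = field(default_factory=dict)     # role -> negative authorizations
    user_roles: dict = field(default_factory=dict)  # user -> set of roles

    def is_allowed(self, user, action, resource):
        roles = self.user_roles.get(user, set())
        request = (action, resource)
        # Assumed conflict rule: an explicit denial on any role overrides a grant.
        if any(request in self.denials.get(role, set()) for role in roles):
            return False
        # Closed policy: access requires at least one positive authorization.
        return any(request in self.grants.get(role, set()) for role in roles)


policy = Policy(
    grants={"physician": {("read", "medical_record")},
            "billing": {("read", "invoice")}},
    denials={"trainee": {("read", "medical_record")}},
    user_roles={"alice": {"physician"}, "bob": {"physician", "trainee"}},
)
```

Here `bob` holds both a granting and a denying role, so the outcome depends entirely on the chosen conflict rule; a permit-overrides policy would give the opposite answer.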
Inference and Access Control: Security Constraint Processing User Interface Manager Constraint Manager Security Constraints Query Processor: Constraints during query and release operations Update Processor: Constraints during update operation Database Design Tool Constraints during database design operation Database Relational DBMS
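One way to picture constraint processing during query and release operations is a manager that remembers what each user has already been given and refuses a query that would complete a sensitive combination. This is a minimal sketch under that assumption; the `ConstraintManager` class and the attribute names are hypothetical, not taken from the system in the figure.

```python
class ConstraintManager:
    """Hypothetical constraint manager for query and release operations."""

    def __init__(self, constraints):
        # Each constraint is a set of attributes that must never be
        # released together to the same user.
        self.constraints = [frozenset(c) for c in constraints]
        self.released = {}  # user -> attributes released so far

    def check_query(self, user, attributes):
        """Return True and record the release, or False if a constraint would be violated."""
        combined = self.released.get(user, set()) | set(attributes)
        for constraint in self.constraints:
            if constraint <= combined:
                return False  # answering would complete a sensitive association
        self.released[user] = combined
        return True
```

A user who has already received `name` would then be refused `salary` if the pair is constrained, even though each attribute is releasable on its own.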
Digital Identity Management l Digital identity is the identity that a user has to access an electronic resource l A person could have multiple identities - A physician could have an identity to access medical resources and another to access his bank accounts l Digital identity management is about managing the multiple identities - Manage databases that store and retrieve identities - Resolve conflicts and heterogeneity - Make associations - Provide security l Ontology management for identity management is an emerging research area
Digital Identity Management - II l Federated Identity Management - Corporations work with each other across organizational boundaries with the concept of federated identity - Each corporation has its own identity and may belong to multiple federations - Individual identity management within an organization and federated identity management across organizations l Technologies for identity management - Database management, data mining, ontology management, federated computing
Digital Identity Management – III What is going on in this area? l Private Sector Activity - Microsoft Passport, Liberty Alliance l Public Sector Activity - Federal Executive, State Executive l Some Public and Private Systems - E-Tailing and User names, E-Government and Integration, Government interest in Single Identity, Fair Information Practices: Citizens managing their own identity l Approaches - Single Federal National System, State Federated System, Systemic Uniformity l Source: Identity Management White paper by the National Electronic Commerce Coordinating Council, December 2002
Identity Theft Management l Need for secure identity management - Ease the burden of managing numerous identities - Prevent misuse of identity: preventing identity theft l Identity theft is stealing another person’s digital identity l Techniques for preventing identity theft include - Access control, Encryption, Digital Signatures - A merchant encrypts the data with the public key of the recipient and signs it with the merchant’s own private key - Recipient decrypts with his private key and verifies the signature with the merchant’s public key
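The encrypt-and-sign exchange can be sketched with textbook RSA: the recipient's public key encrypts, the sender's private key signs. The primes below are tiny and there is no padding, so this is a toy for illustration only and not a secure implementation; all function names and values are assumptions.

```python
import hashlib
from math import gcd


def make_keypair(p, q, e):
    """Build a toy RSA key pair from two small primes (insecure, illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    assert gcd(e, phi) == 1
    d = pow(e, -1, phi)  # modular inverse, Python 3.8+
    return (e, n), (d, n)


def encrypt(m, public_key):
    e, n = public_key
    return pow(m, e, n)


def decrypt(c, private_key):
    d, n = private_key
    return pow(c, d, n)


def digest(m, n):
    # Hash the message and reduce it into the signer's modulus.
    return int.from_bytes(hashlib.sha256(str(m).encode()).digest(), "big") % n


def sign(m, private_key):
    d, n = private_key
    return pow(digest(m, n), d, n)


def verify(m, signature, public_key):
    e, n = public_key
    return pow(signature, e, n) == digest(m, n)


merchant_pub, merchant_priv = make_keypair(61, 53, 17)
recipient_pub, recipient_priv = make_keypair(89, 97, 5)

order = 1234                                # must be smaller than the recipient's modulus
ciphertext = encrypt(order, recipient_pub)  # only the recipient can decrypt
signature = sign(order, merchant_priv)      # only the merchant could have produced this
```

The recipient recovers the order with `decrypt(ciphertext, recipient_priv)` and checks its origin with `verify(order, signature, merchant_pub)`.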
Digital Forensics l Digital forensics is about the investigation of cyber crime l Follows the procedures established for forensic medicine l The steps include the following: - When a computer crime occurs, law enforcement officials who are cyber crime experts gather every piece of evidence including information from the crime scene (i.e. from the computer) - Gather profiles of terrorists - Use history information - Carry out analysis
Digital Forensics - II l Digital Forensics Techniques - Intrusion detection - Data mining - Analyzing log files - Use criminal profiling and develop psychological profiles - Analyze messages l Lawyers, Psychologists, Sociologists, Crime investigators and Technologists have to work together l International Journal of Digital Evidence is a useful source
Steganography and Digital Watermarking l Steganography is about hiding information within other information - E.g., the hidden information is the message that terrorists may be sending to their peers in different parts of the world - Information may be hidden in valid texts, images, films etc. - Difficult for the unsuspecting human to detect l Steganalysis is about developing techniques that can analyze text, images, video and detect hidden messages - May use data mining techniques to detect hidden patterns l Steganography makes the task of the cyber crime expert difficult as he/she has to analyze for hidden information - Communication protocols are being developed
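A common textbook way to hide information in an image or audio sample is to overwrite the least significant bit of each byte. The sketch below assumes that technique (the slides do not name a specific one); `hide`, `reveal`, and the cover data are illustrative.

```python
def hide(cover: bytes, message: bytes) -> bytes:
    """Store each message bit in the lowest bit of one cover byte."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # change at most the lowest bit
    return bytes(stego)


def reveal(stego: bytes, length: int) -> bytes:
    """Read back `length` bytes from the lowest bits of the stego data."""
    out = bytearray()
    for j in range(length):
        value = 0
        for i in range(8):
            value = (value << 1) | (stego[j * 8 + i] & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(256))  # stand-in for pixel or audio sample values
stego = hide(cover, b"hi")
```

Because each byte changes by at most one, the carrier looks unchanged to a casual observer, which is why steganalysis tends to rely on statistical pattern detection.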
Steganography and Digital Watermarking - II l Digital watermarking is about inserting information, for valid purposes, without it being detected - It has applications in copyright protection - A manufacturer may use digital watermarking to mark the copyright on a particular piece of music or video without it being noticed - When music is copied and copyright is violated, one can detect who the real owner is by examining the copyright embedded in the music or video
Risk Analysis l Analyzing risks - Before installing a secure system or a network one needs to conduct a risk analysis study - What are the threats? What are the risks? l Various types of risk analysis methods - Quantitative approach: Events are ranked in the order of risks and decisions are made based on the risks - Qualitative approach: estimates are used for risks l Security vs Cost - If risks are high and damage is significant then it may be worth the cost of incorporating security - If risks and damage are not high, then security may be an additional cost burden
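The security-versus-cost trade-off can be made concrete by ranking threats by expected loss and comparing that loss with the cost of a countermeasure. This is a minimal sketch assuming expected loss is likelihood times damage and that the countermeasure removes the risk entirely; the `Threat` class and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    likelihood: float    # assumed probability of occurrence per year, 0..1
    damage: float        # assumed loss if the event occurs
    control_cost: float  # assumed yearly cost of the countermeasure

    @property
    def expected_loss(self) -> float:
        return self.likelihood * self.damage


def rank_threats(threats):
    """Order threats from highest to lowest expected loss."""
    return sorted(threats, key=lambda t: t.expected_loss, reverse=True)


def worth_securing(threat: Threat) -> bool:
    """Security pays off only if it costs less than the loss it prevents."""
    return threat.expected_loss > threat.control_cost


threats = [
    Threat("kiosk defacement", likelihood=0.05, damage=1_000, control_cost=5_000),
    Threat("denial of service", likelihood=0.3, damage=200_000, control_cost=20_000),
]
ranked = rank_threats(threats)
```

With these made-up numbers the denial-of-service threat justifies its countermeasure, while protecting the kiosk would be an additional cost burden.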
Encryption: Secure Web Service Architecture Confidentiality, Authenticity, Integrity Service requestor Service provider UDDI Query BusinessEntity BusinessService BindingTemplate BusinessService tModel PublisherAssertion Owner encrypts documents with his/her private key; Use of Merkle Signatures for further protection
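The slide mentions Merkle signatures without detail. As a hedged illustration, the sketch below shows the Merkle hash tree idea such schemes build on: the owner would sign only the root hash, and a requestor could check one returned entry against that root using a short proof. The function names and the registry entries are assumptions.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level):
    # Pair adjacent nodes; an odd node out is paired with a copy of itself.
    if len(level) % 2:
        level = level + [level[-1]]
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Root hash over a non-empty list of byte strings."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Sibling hashes needed to recompute the root from leaf `index`."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling is on the left)
        level = _next_level(level)
        index //= 2
    return proof


def verify_proof(leaf, proof, root):
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


entries = [b"businessEntity", b"businessService", b"bindingTemplate",
           b"tModel", b"publisherAssertion"]
root = merkle_root(entries)  # the owner would sign this single value
```

A requestor holding the signed root can call `verify_proof(entries[3], merkle_proof(entries, 3), root)` without seeing the other entries in full.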
Biometrics l Early Identification and Authentication (I&A) systems were based on passwords l Recently physical characteristics of a person are being used for identification - Fingerprinting - Facial features - Iris scans - Blood circulation - Facial expressions l Biometrics techniques will provide access not only to computers but also to buildings and homes l Other Applications
Biometric Technologies l Pattern recognition l Machine learning l Statistical reasoning l Multimedia/Image processing and management l Managing biometric databases l Information retrieval l Pattern matching l Searching l Ontology management l Data mining
Secure Biometrics l Biometrics systems have to be secure l Need to study the attacks on biometrics systems l Facial features may be modified: - E.g., one can gain access by inserting another person’s features - Attacks on biometric databases are a major concern l Challenge is to develop secure biometric systems - Policy, Model, Architecture - Need to maintain privacy of the individuals as appropriate
Relationships between Dependability, Confidentiality, Privacy, Trust l Dependability: Security, Privacy, Trust, Real-time Processing, Fault Tolerance; also sometimes referred to as “Trustworthiness” l Confidentiality: Preventing the unauthorized release of information considered sensitive l Privacy: Preventing the unauthorized release of information about individuals considered sensitive l Trust: Confidence one has that an individual will give him/her correct information or that an individual will protect sensitive information
Some Privacy concerns l Medical and Healthcare - Employers, marketers, or others knowing of private medical concerns l Security - Allowing access to individual’s travel and spending data - Allowing access to web surfing behavior l Marketing, Sales, and Finance - Allowing access to individual’s purchases
Data Mining as a Threat to Privacy l Data mining gives us “facts” that are not obvious to human analysts of the data l Can general trends across individuals be determined without revealing information about individuals? l Data Mining is a critical application for National Security and Intrusion Detection l Possible threats due to data mining: - Combine collections of data and infer information that is private, e.g., disease information from prescription data, or military action from pizza deliveries to the Pentagon l Need to protect the associations and correlations between the data that are sensitive or private
Some Privacy Problems and Potential Solutions l Problem: Privacy violations that result due to data mining - Potential solution: Privacy-preserving data mining l Problem: Privacy violations that result due to the Inference problem - Inference is the process of deducing sensitive information from the legitimate responses received to user queries - Potential solution: Privacy Constraint Processing l Problem: Privacy violations due to un-encrypted data - Potential solution: Encryption at different levels l Problem: Privacy violation due to poor system design - Potential solution: Develop methodology for designing privacy- enhanced systems
Some Directions: Privacy Preserving Data Mining l Prevent useful results from mining - Introduce “cover stories” to give “false” results - Only make a sample of data available so that an adversary is unable to come up with useful rules and predictive functions l Randomization - Introduce random values into the data and/or results - Challenge is to introduce random values without significantly affecting the data mining results - Give range of values for results instead of exact values l Secure Multi-party Computation - Each party knows its own inputs; encryption techniques used to compute final results - Rules, predictive functions l Approach: Only make a sample of data available - Limits ability to learn good classifier
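The randomization idea can be sketched by adding zero-mean noise to each record, so that individual values are obscured while aggregates stay close to the originals. This is a minimal illustration using uniform noise; the noise range, seed, and synthetic salary data are assumptions rather than a recommended parameterization.

```python
import random
from statistics import mean


def randomize(values, spread, rng):
    """Perturb each value with uniform noise drawn from [-spread, +spread]."""
    return [v + rng.uniform(-spread, spread) for v in values]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
salaries = [rng.randint(30_000, 90_000) for _ in range(10_000)]  # synthetic data
perturbed = randomize(salaries, 5_000, rng)

# Individual records are distorted, but the aggregate is nearly unchanged.
drift = abs(mean(salaries) - mean(perturbed))
```

The challenge named on the slide shows up directly in the `spread` parameter: a larger value hides individuals better but degrades the mining results more.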
Platform for Privacy Preferences (P3P): What is it? l P3P is an emerging industry standard that enables web sites to express their privacy practices in a standard format l Policies in this format can be automatically retrieved and understood by user agents l It is a product of W3C, the World Wide Web Consortium l When a user enters a web site, the privacy policies of the web site are conveyed to the user l If the privacy policies are different from user preferences, the user is notified l User can then decide how to proceed l Being adopted by the Semantic Web community
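The matching step performed by a user agent can be pictured as comparing what a site declares against what the user accepts. The sketch below uses plain dictionaries rather than the actual P3P XML vocabulary; `find_conflicts`, the data items, and the purposes are all hypothetical.

```python
def find_conflicts(site_policy, user_prefs):
    """Return (data item, purpose) pairs the site declares but the user has not accepted."""
    conflicts = []
    for data_item, purposes in site_policy.items():
        allowed = user_prefs.get(data_item, set())
        for purpose in sorted(purposes - allowed):
            conflicts.append((data_item, purpose))
    return conflicts


# Illustrative policy and preferences, not real P3P element names.
site_policy = {
    "email": {"order_confirmation", "marketing"},
    "address": {"shipping"},
}
user_prefs = {
    "email": {"order_confirmation"},
    "address": {"shipping"},
}

conflicts = find_conflicts(site_policy, user_prefs)
if conflicts:
    print("Policy differs from your preferences:", conflicts)
```

An empty result means the site can be entered silently; otherwise the user is notified and decides how to proceed.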
Layered Architecture for Dependable Semantic Web l Some Challenges: Interoperability between Layers; Security and Privacy cut across all layers; Integration of Services; Composability l Figure labels: URI, UNICODE; XML, XML Schemas; RDF, Ontologies; Rules/Query; Logic, Proof and Trust; Other Services; SECURITY and PRIVACY alongside the layers l Adapted from Tim Berners Lee’s description of the Semantic Web
Rule Processing Policies Ontologies Rules Semantic Web Engine XML, RDF Documents Web Pages, Databases Inference Engine/ Rules Processor Interface to the Semantic Web Technology By W3C
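The role of the inference engine and rules processor can be illustrated by forward chaining over RDF-style triples: rules are applied repeatedly until no new facts appear. This is a minimal sketch, not the W3C technology itself; the two rules, the predicate names, and the sample facts are assumptions.

```python
def transitive_subclass(facts):
    # If A subClassOf B and B subClassOf C, then A subClassOf C.
    return {(a, "subClassOf", c)
            for (a, p1, b) in facts if p1 == "subClassOf"
            for (b2, p2, c) in facts if p2 == "subClassOf" and b2 == b}


def inherit_type(facts):
    # If x has type B and B subClassOf C, then x has type C.
    return {(x, "type", c)
            for (x, p1, b) in facts if p1 == "type"
            for (b2, p2, c) in facts if p2 == "subClassOf" and b2 == b}


def forward_chain(facts, rules):
    """Apply every rule until a fixed point is reached."""
    facts = set(facts)
    while True:
        new = set().union(*(rule(facts) for rule in rules)) - facts
        if not new:
            return facts
        facts |= new


facts = {
    ("Cardiologist", "subClassOf", "Physician"),
    ("Physician", "subClassOf", "Employee"),
    ("alice", "type", "Cardiologist"),
}
closure = forward_chain(facts, [transitive_subclass, inherit_type])
```

The derived fact that `alice` is an `Employee` was never stated explicitly, which is the same mechanism that makes inference a security concern when the derived fact is sensitive.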
Vision for Cyber Security: Securing the Semantic Web l Core Semantic Web Technologies: Systems, Networks, Agents, AI, Machine Learning, Data Mining, Languages, Software Engineering, Information Integration l Need research to bring together the above technologies l Directions: Security/Trust/Privacy, Integrate sensor technologies, Pervasive computing, Social impact l Domain specific semantic webs: DoD, Intelligence, Medical, Treasury l Some Challenges: Secure Semantic Interoperability; Secure Information Integration; Integrating Pervasive computing and sensors