Building Trustworthy Semantic Webs Dr. Bhavani Thuraisingham The University of Texas at Dallas Multilevel Secure Data Management and Its Implications for Multilevel Semantic Web Technologies October 27, 2008
Outline What is an MLS/DBMS? Summary of Developments Challenges Data Models Implications for semantic web
Overview of MLS/DBMS
What is an MLS/DBMS? Users are cleared at different security levels. Data in the database is assigned different sensitivity levels, forming a multilevel database. Users share the multilevel database. The MLS/DBMS is the software that ensures that users obtain only information at or below their level. In general, a user reads at or below his or her level and writes at that level.
Need for an MLS/DBMS: Operating systems control access to files, which is a coarser granularity. A database stores relationships between data. Content-based, context-based, and dynamic access control are needed. Traditional operating system access control to files is not sufficient; multilevel access control is needed for DBMSs.
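The read and write rule above can be sketched in a few lines. This is a minimal illustration, not taken from the slides: the four level names and the function names `can_read` and `can_write` are assumptions chosen for the example.

```python
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical linear ordering of security levels (assumed for illustration).
    UNCLASSIFIED = 0
    CONFIDENTIAL = 1
    SECRET = 2
    TOP_SECRET = 3


def can_read(subject: Level, obj: Level) -> bool:
    """A user reads at or below his or her level."""
    return subject >= obj


def can_write(subject: Level, obj: Level) -> bool:
    """A user writes only at his or her own level."""
    return subject == obj
```

For example, a Secret user may read Unclassified data but may not write it, which keeps Secret information from flowing down into Unclassified objects.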
Summary of Developments: Early efforts, 1975 – 1982 (example: Hinke-Schaefer approach). Air Force Summer Study, 1982. Research prototypes (Integrity Lock, SeaView, LDV, etc.), 1984 - present. Trusted Database Interpretation, published 1991. Commercial products, 1988 - present.
Taxonomy for MLS/DBMSs
Integrity Lock Architecture: Trusted filter; untrusted back-end and untrusted front-end. The filter computes a checksum based on data content and security level, and the checksum is recomputed when the data is retrieved.
Operating System Providing Access Control (Single Kernel): Multilevel data is partitioned into single-level files. The operating system controls access to the files.
Extended Kernel: Kernel extensions for functions such as inference, aggregation, and constraint processing.
Trusted Subject: The DBMS provides access control to its own data, such as relations, tuples, and attributes.
Distributed: Data is partitioned according to security levels. In the partitioned approach, data is not replicated and there is one DBMS per level. In the replicated approach, lower-level data is replicated at the higher-level databases.
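The integrity lock idea, a checksum bound to both the data content and its security level, might look like the following sketch. The slides say only "checksum"; using a keyed HMAC-SHA256 is an assumption made here, as are the key value and the names `stamp` and `verify`.

```python
import hashlib
import hmac

# Hypothetical secret held only by the trusted filter (assumed for illustration).
FILTER_KEY = b"demo-key-held-only-by-the-trusted-filter"


def stamp(content: bytes, level: str, key: bytes = FILTER_KEY) -> str:
    """Compute a checksum over the security level and the data content."""
    # Length-prefix the level so that level and content cannot be confused.
    message = len(level).to_bytes(2, "big") + level.encode() + content
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(content: bytes, level: str, checksum: str,
           key: bytes = FILTER_KEY) -> bool:
    """Recompute the checksum on retrieval and compare with the stored one."""
    return hmac.compare_digest(stamp(content, level, key), checksum)
```

If the untrusted back-end alters either the stored content or the stored label, the recomputed checksum no longer matches and the filter can reject the record.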
Integrity Lock
Operating System Providing Mandatory Access Control
Extended Kernel
Trusted Subject
Distributed Approach - I
Distributed Approach II
Some Challenges: Inference Problem. Inference is the process of forming conclusions from premises. If the conclusions are unauthorized, it becomes a problem. This is the inference problem in a multilevel environment. The aggregation problem is a special case of the inference problem: a collection of data elements is Secret, but the individual elements are Unclassified. Association problem: attributes A and B taken together are Secret; individually they are Unclassified.
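One way to picture the association problem is as a constraint that raises the level of a query result when certain attributes appear together. The sketch below is illustrative only: the attribute names, the constraint table, and the function `result_level` are hypothetical.

```python
UNCLASSIFIED, SECRET = 0, 1

# Hypothetical association constraint: name and salary together are Secret,
# although each attribute alone is Unclassified.
ASSOCIATION_CONSTRAINTS = [
    ({"name", "salary"}, SECRET),
]


def result_level(projected: set) -> int:
    """Return the level of a query result given the attributes it projects."""
    level = UNCLASSIFIED
    for attributes, constraint_level in ASSOCIATION_CONSTRAINTS:
        # The constraint fires only if every listed attribute is projected.
        if attributes <= projected:
            level = max(level, constraint_level)
    return level
```

This check looks at a single query only. A user who issues two separate Unclassified queries and joins the results by hand would evade it, which is part of what makes the inference problem hard.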
Some Challenges: Polyinstantiation. A mechanism to avoid certain signaling channels. It also supports cover stories. Example: John and James have different salaries at different levels.
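A small sketch may help make polyinstantiation concrete. It assumes rows keyed by name plus security level, so the same name can appear once per level; the salary figures and the functions `insert` and `view` are hypothetical, and showing the highest visible row is just one possible choice of semantics.

```python
UNCLASSIFIED, SECRET = 0, 1

# Hypothetical multilevel relation: (name, salary, level).
ROWS = [
    ("John", 20000, UNCLASSIFIED),   # cover story visible at the low level
    ("John", 50000, SECRET),
    ("James", 30000, UNCLASSIFIED),
]


def insert(rows, name, salary, level):
    """Reject duplicates only at the same level.

    Rejecting a low insert because a high row exists would reveal the
    existence of the high row, which is the signaling channel to avoid.
    """
    if any(n == name and l == level for n, _, l in rows):
        raise ValueError("duplicate key at this level")
    rows.append((name, salary, level))


def view(rows, clearance):
    """For each name, show the highest-level row the user is cleared to see."""
    best = {}
    for name, salary, level in rows:
        if level <= clearance and (name not in best or level > best[name][1]):
            best[name] = (salary, level)
    return {name: salary for name, (salary, _) in best.items()}
```

An Unclassified user sees John's cover-story salary, while a Secret user sees the Secret value for the same name.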
Some Challenges: Covert Channel. Database transactions manipulate data locks and covertly pass information. Consider two transactions T1 and T2: T1 operates at the Secret level and T2 operates at the Unclassified level. Relation R is classified at the Unclassified level. T1 obtains a read lock on R and T2 requests a write lock on R. T1 and T2 can manipulate when they request locks and signal one bit of information for each attempt; over time, T1 could covertly send sensitive information to T2.
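The lock-based channel can be simulated deterministically. In this sketch the `LockManager` class, the `leak` function, and the fixed time slots are simplifying assumptions: in each slot, T1 holds a read lock on R to send a 1 and holds nothing to send a 0, and T2 learns the bit from whether its write lock request is granted.

```python
class LockManager:
    """Toy lock manager for a single relation R (assumed for illustration)."""

    def __init__(self):
        self.readers = set()
        self.writer = None

    def acquire_read(self, txn):
        if self.writer is not None:
            return False
        self.readers.add(txn)
        return True

    def acquire_write(self, txn):
        # A write lock conflicts with any other reader or writer.
        if self.readers - {txn} or self.writer not in (None, txn):
            return False
        self.writer = txn
        return True

    def release(self, txn):
        self.readers.discard(txn)
        if self.writer == txn:
            self.writer = None


def leak(bits):
    """T1 (Secret) signals each bit to T2 (Unclassified) through lock conflicts."""
    manager = LockManager()
    received = []
    for bit in bits:
        if bit:
            manager.acquire_read("T1")        # holding the lock encodes a 1
        granted = manager.acquire_write("T2")
        received.append(0 if granted else 1)  # a denied request decodes as 1
        manager.release("T1")
        manager.release("T2")
    return received
```

T1 never writes to R, so ordinary access control is not violated; the information flows only through the timing of lock requests.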
Multilevel Secure Data Model: Classifying Databases
Multilevel Secure Data Model: Classifying Relations
Multilevel Secure Data Model: Classifying Attributes/Columns
Multilevel Secure Data Model: Classifying Tuples/Rows
Multilevel Secure Data Model: Classifying Elements
Multilevel Secure Data Model: Classifying Views
Multilevel Secure Data Model: Classifying Metadata
Status and Directions: MLS/DBMSs have been designed and developed for various kinds of database systems, including object systems, deductive systems, and distributed systems. They provide an approach to hosting secure applications. The principles can be used to design privacy-preserving database systems. The challenge is to host emerging secure applications, including semantic web technologies (MLS XML, MLS RDF, etc.).
Multilevel Semantic Web Technologies: Take RDF as an example. What do we classify for RDF: triples? Security properties for RDF Schema. Query modification with SPARQL. Inference problem based on RDF data and reasoning. Design of the system: trusted subject? Extra credit assignment: design of a multilevel RDF data store. Potential opportunity for an RA position for the Spring semester to implement the design.
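As a starting point for thinking about the design, here is a minimal sketch in which the unit of classification is the triple and every query is modified to add a level condition. It uses plain Python rather than a real RDF library or SPARQL engine; the `ex:` identifiers, the sample data, and the functions `match` and `query` are all hypothetical.

```python
UNCLASSIFIED, SECRET = 0, 1

# Hypothetical store: each (subject, predicate, object) triple carries a level.
TRIPLES = [
    (("ex:John", "ex:worksFor", "ex:AcmeCorp"), UNCLASSIFIED),
    (("ex:John", "ex:salary", "50000"), SECRET),
]


def match(pattern, triple):
    """Match a triple against a pattern; None acts as a wildcard variable."""
    return all(p is None or p == t for p, t in zip(pattern, triple))


def query(store, pattern, clearance):
    """Query modification: add 'level <= clearance' to the user's pattern."""
    return [triple for triple, level in store
            if level <= clearance and match(pattern, triple)]
```

An Unclassified user asking for everything about `ex:John` gets only the employer triple, while a Secret user also gets the salary triple. The sketch leaves the harder questions from the slide open, such as labeling RDF Schema and controlling what a reasoner can infer from visible triples.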