Data and Applications Security Developments and Directions Dr. Bhavani Thuraisingham The University of Texas at Dallas Trustworthy Semantic Webs October 2010
Outline Semantic web XML security RDF security Ontologies and Security Rules and Security Reference: Building trustworthy semantic web, Thuraisingham, CRC Press, 2007
From Today’s Web to Semantic Web. Today’s web: high recall but low precision (searches return too many web pages, many of them not relevant); sometimes low recall; results are sensitive to vocabulary (different words that mean the same thing do not return the same web pages); results are single web pages, not linked web pages. Semantic web: machine-understandable web pages; activities on the web, such as searching, with little or no human intervention; technologies for knowledge management, e-commerce, and interoperability; solutions to the problems faced by today’s web.
Knowledge Management and Personal Agents. Corporation need: searching, extracting, and maintaining information; uncovering hidden dependencies; viewing information. Semantic web for knowledge management: organizing knowledge; automated tools for maintaining knowledge; question answering; querying multiple documents; controlling access to documents. Personal agent: John is the president of a company. He needs surgery for a serious but not critical illness. With the current web, he has to check each web page for relevant information and make decisions based on the information provided. With the semantic web, the agent will retrieve all the relevant information, synthesize it, ask John for input if needed, and then present the various options and make recommendations.
E-Commerce. Business to consumer: users shop on the web; wrapper technology is used to extract information about user preferences, etc., and display the products to the user. Use of semantic web: develop software agents that can interpret privacy requirements, pricing, and product information and display timely and correct information to the user; the agents also provide information about the reputation of shops. Business to business: organizations work together and carry out transactions, such as collaborating on a product or on supply chains. Today’s web lacks standards for data exchange. Use of semantic web: XML is a big improvement, but the parties need to agree on vocabulary. The future will be the use of ontologies to agree on meanings and interpretations.
Layered Approach: Tim Berners Lee’s Vision www.w3c.org
Credentials in XML
<Professor credID="9" subID="16" CIssuer="2">
  <name>Alice Brown</name>
  <university>University of X</university>
  <department>CS</department>
  <research-group>Security</research-group>
</Professor>
<Secretary credID="12" subID="4" CIssuer="2">
  <name>John James</name>
  <level>Senior</level>
</Secretary>
Policies in XML
<?xml version="1.0" encoding="utf-8"?>
<Policy-base>
  <policy-spec cred-expr="//Professor[department = 'CS']" target="annual_report.xml" path="//Patent[@Dept = 'CS']//Node()" priv="VIEW"/>
  <policy-spec cred-expr="//Professor[department = 'CS']" target="annual_report.xml" path="//Patent[@Dept = 'EE']/Short-descr/Node() and //Patent[@Dept = 'EE']/authors" priv="VIEW"/>
  <policy-spec cred-expr = - - - -
  <policy-spec cred-expr = - - - -
</Policy-base>
Explanation: CS professors are entitled to access all the patents of their own department. For patents of the EE department, they are entitled to see only the short descriptions and authors.
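The way a policy spec ties a credential expression to a document path can be sketched in a few lines of Python. This is only an illustration, not the system's implementation: the `<credentials>` wrapper element, the sample report, and the helper names `subjects_matching` and `viewable` are assumptions, and the XPath expressions are rewritten with a leading `.` because Python's `xml.etree.ElementTree` supports only a limited XPath subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical credential base: the two credentials from the earlier slide,
# placed under an assumed <credentials> wrapper element.
CREDENTIALS = """
<credentials>
  <Professor credID="9" subID="16" CIssuer="2">
    <name>Alice Brown</name>
    <department>CS</department>
  </Professor>
  <Secretary credID="12" subID="4" CIssuer="2">
    <name>John James</name>
    <level>Senior</level>
  </Secretary>
</credentials>
"""

# Hypothetical stand-in for annual_report.xml.
REPORT = """
<report>
  <Patent Dept="CS"><Short-descr>Secure index</Short-descr></Patent>
  <Patent Dept="EE"><Short-descr>Low-power chip</Short-descr></Patent>
</report>
"""

# Each policy: (credential expression, document path, privilege).
POLICIES = [
    (".//Professor[department='CS']", ".//Patent[@Dept='CS']", "VIEW"),
]

def subjects_matching(cred_root, cred_expr):
    """Return the subID of every credential selected by cred_expr."""
    return {cred.get("subID") for cred in cred_root.findall(cred_expr)}

def viewable(sub_id, cred_root, doc_root, policies):
    """Collect the document elements the subject may VIEW."""
    allowed = []
    for cred_expr, path, priv in policies:
        if priv == "VIEW" and sub_id in subjects_matching(cred_root, cred_expr):
            allowed.extend(doc_root.findall(path))
    return allowed

creds = ET.fromstring(CREDENTIALS)
report = ET.fromstring(REPORT)
alice_view = [p.get("Dept") for p in viewable("16", creds, report, POLICIES)]
john_view = [p.get("Dept") for p in viewable("4", creds, report, POLICIES)]
```

Here the subject with `subID` 16 (the CS professor) matches the credential expression and sees the CS patent, while the secretary matches no policy and sees nothing.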
Access Control Strategy. Subjects request access to XML documents under two modes: browsing and authoring. With browsing access, a subject can read and navigate documents. Authoring access is needed to modify, delete, or append to documents. The access control module checks the policy base and applies the policy specs. Views of the document are created based on credentials and policy specs. In case of conflict, the least access privilege rule is enforced. Works for both push and pull modes.
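A minimal sketch of view creation with conflict resolution follows. It reads the least access privilege rule as "when policy specs disagree, the weakest privilege wins", which is an interpretation rather than something the slide spells out. The `NONE` privilege, the numeric ranks, the deny-by-default choice, and the node names are all illustrative assumptions.

```python
# Assumed privilege ordering, weakest first. The slide names only the browsing
# and authoring modes; NONE and the ranks are illustrative.
RANK = {"NONE": 0, "BROWSE": 1, "AUTHOR": 2}

def resolve(privileges):
    """Least-privilege rule: of the conflicting privileges, keep the weakest."""
    if not privileges:
        return "NONE"  # no applicable policy spec: deny by default (assumption)
    return min(privileges, key=RANK.__getitem__)

def build_view(nodes, grants):
    """Return the nodes a subject can at least browse.

    nodes  : node identifiers in the requested document, in document order
    grants : dict mapping node identifier -> privileges from matching policy specs
    """
    return [n for n in nodes if RANK[resolve(grants.get(n, []))] >= RANK["BROWSE"]]

grants = {
    "patent-cs": ["BROWSE", "AUTHOR"],  # two specs disagree: the weaker one wins
    "patent-ee": ["BROWSE"],
    "budget": ["AUTHOR", "NONE"],       # one spec denies access: node is hidden
}
view = build_view(["patent-cs", "patent-ee", "budget", "payroll"], grants)
```

The resulting view keeps the two patent nodes and drops `budget` (conflict resolved to no access) and `payroll` (no matching policy spec).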
System Architecture for Access Control (Figure not reproduced. Labeled components: User, with Pull/Query and Push/Result flows; X-Access; X-Admin; Admin Tools; Credential base; Policy base; XML Documents.)
Third-Party Architecture. The Owner is the producer of information; it specifies access control policies. The Publisher is responsible for managing (a portion of) the Owner’s information and answering subject queries. Goal: allow the Publisher to be untrusted with respect to authenticity and completeness, so that subjects can check both. (Figure not reproduced. Labeled components: XML Source, Credential base, Policy base, SE-XML, Owner, Publisher, User/Subject. Labeled flows: Query, Credentials, Reply document.)
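The slide does not say how a subject checks authenticity and completeness when the Publisher is untrusted. One possible approach, assumed here purely for illustration, is a Merkle-style hash over the XML tree: the Owner computes (and would sign) a digest of the document, and the subject recomputes it over the Publisher's reply. The function name `merkle_hash` and the sample documents are hypothetical, and signing is omitted.

```python
import hashlib
import xml.etree.ElementTree as ET

def merkle_hash(elem):
    """Hash an element from its tag, text, sorted attributes, and child hashes."""
    h = hashlib.sha256()
    parts = [elem.tag, (elem.text or "").strip()]
    parts += [f"{key}={value}" for key, value in sorted(elem.attrib.items())]
    for part in parts:
        h.update(part.encode("utf-8") + b"\x00")  # separator between fields
    for child in elem:  # children contribute in document order
        h.update(merkle_hash(child))
    return h.digest()

ORIGINAL = (
    "<report>"
    "<Patent Dept='CS'><authors>A. Brown</authors></Patent>"
    "<Patent Dept='EE'><authors>C. Doe</authors></Patent>"
    "</report>"
)
# A reply whose content was altered (authenticity violation).
TAMPERED = ORIGINAL.replace("C. Doe", "M. Allory")
# A reply with one patent silently dropped (completeness violation).
TRUNCATED = "<report><Patent Dept='CS'><authors>A. Brown</authors></Patent></report>"

owner_digest = merkle_hash(ET.fromstring(ORIGINAL))  # the Owner would sign this value

def reply_is_valid(reply_xml):
    """Subject-side check: recompute the digest and compare with the Owner's."""
    return merkle_hash(ET.fromstring(reply_xml)) == owner_digest
```

In this sketch the check covers only full-document replies. Verifying a partial view would additionally require the Publisher to supply the hashes of the pruned subtrees, which is not shown.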
XML Databases. Data is presented as XML documents. Query language: XML-QL. Query optimization. Managing transactions on XML documents. Metadata management: XML schemas/DTDs. Access methods and index strategies. XML security and integrity management.
Inference/Privacy Control (Figure not reproduced. Labeled components: Interface to the Semantic Web; Technology by UTD; Inference Engine/Rules Processor; Policies; Ontologies; Rules; XML Documents; Web Pages, Databases; XML Database.)
RDF Policy Specification
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:xsd="http:// - - -" xmlns:uni="http:// - - - -">
  <rdf:Description rdf:about="949352">
    <uni:name>Berners Lee</uni:name>
    <uni:title>Professor</uni:title>
    Level = L1
  </rdf:Description>
  <rdf:Description rdf:about="ZZZ">
    <uni:bookname>semantic web</uni:bookname>
    <uni:authoredby>Berners Lee</uni:authoredby>
    Level = L2
  </rdf:Description>
</rdf:RDF>
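One way to read the `Level` annotations is as a security label attached to every statement in a description, with a subject allowed to read only statements at or below its clearance. The sketch below works under that reading; the ordering L1 < L2, the tuple representation of triples, and the function name `readable` are assumptions rather than anything the slide specifies.

```python
# Assumed ordering of security levels: L1 is less sensitive than L2.
LEVELS = {"L1": 1, "L2": 2}

# The slide's statements as (subject, predicate, object) triples, each paired
# with the level of the description it appears in.
LABELLED_TRIPLES = [
    (("949352", "uni:name", "Berners Lee"), "L1"),
    (("949352", "uni:title", "Professor"), "L1"),
    (("ZZZ", "uni:bookname", "semantic web"), "L2"),
    (("ZZZ", "uni:authoredby", "Berners Lee"), "L2"),
]

def readable(labelled_triples, clearance):
    """Return the triples whose level does not exceed the subject's clearance."""
    limit = LEVELS[clearance]
    return [triple for triple, level in labelled_triples if LEVELS[level] <= limit]

l1_view = readable(LABELLED_TRIPLES, "L1")  # only the statements about 949352
l2_view = readable(LABELLED_TRIPLES, "L2")  # all four statements
```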
RDF Schema
RDF Schema is needed to specify statements such as "professor is a subclass of academic staff member".
<rdfs:Class rdf:ID="professor">
  <rdfs:comment>The class of professors. All professors are academic staff members.</rdfs:comment>
  <rdfs:subClassOf rdf:resource="academicStaffMember"/>
</rdfs:Class>
RDF Schema: Security Policies
How can security policies be specified?
<rdfs:Class rdf:ID="professor">
  <rdfs:comment>The class of professors. All professors are academic staff members.</rdfs:comment>
  <rdfs:subClassOf rdf:resource="academicStaffMember"/>
  Level = L
</rdfs:Class>
RDF Inferencing. First-order logic provides a proof system, but inference in it can be computationally infeasible. As a result, Horn clause logic was developed for logic programming; this is still computationally expensive. RDF uses if-then rules: IF E contains the triples (?u, rdfs:subClassOf, ?v) and (?v, rdfs:subClassOf, ?w) THEN E also contains the triple (?u, rdfs:subClassOf, ?w). That is, if u is a subclass of v, and v is a subclass of w, then u is a subclass of w.
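The subclass rule can be applied mechanically by repeating it until no new triple appears. The following is a minimal sketch of that fixed-point computation over triples held as Python tuples; the function name `subclass_closure` and the class `staffMember` are hypothetical additions for the example.

```python
SUBCLASS = "rdfs:subClassOf"

def subclass_closure(triples):
    """Apply the transitivity rule until no new triple is produced."""
    entailed = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot of the current subclass edges (u, v), meaning u subClassOf v.
        edges = [(s, o) for s, p, o in entailed if p == SUBCLASS]
        for u, v in edges:
            for v2, w in edges:
                if v == v2 and (u, SUBCLASS, w) not in entailed:
                    entailed.add((u, SUBCLASS, w))
                    changed = True
    return entailed

E = {
    ("professor", SUBCLASS, "academicStaffMember"),
    ("academicStaffMember", SUBCLASS, "staffMember"),  # hypothetical superclass
}
closure = subclass_closure(E)
```

After the call, `closure` also contains the derived triple `("professor", "rdfs:subClassOf", "staffMember")`. The nested loop is quadratic in the number of subclass edges per pass, which is acceptable for a sketch but not for large schemas.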
RDF Query. One can query RDF documents using XML query languages, but this is very difficult because RDF is much richer than XML. Is there an analogy between, say, XQuery and a query language for RDF? RQL, an SQL-like language, has been developed for RDF: Select from “RDF document” where some “condition”.
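The select-from-where shape of such a query can be imitated over a plain list of triples. The sketch below is not RQL syntax; it only shows the idea of matching a condition against (subject, predicate, object) statements. The `select` helper and the sample data are illustrative assumptions.

```python
def select(triples, where):
    """Return the triples matching a pattern; None in the pattern is a wildcard."""
    return [
        triple
        for triple in triples
        if all(want is None or want == got for want, got in zip(where, triple))
    ]

# Hypothetical "RDF document" held as (subject, predicate, object) triples.
RDF_DOCUMENT = [
    ("949352", "uni:name", "Berners Lee"),
    ("949352", "uni:title", "Professor"),
    ("ZZZ", "uni:bookname", "semantic web"),
    ("ZZZ", "uni:authoredby", "Berners Lee"),
]

# "Select from RDF_DOCUMENT where the predicate is uni:authoredby
#  and the object is Berners Lee."
books = select(RDF_DOCUMENT, (None, "uni:authoredby", "Berners Lee"))
about_person = select(RDF_DOCUMENT, ("949352", None, None))
```

Querying at the level of triples, rather than at the level of XML elements, is what makes the query independent of how the RDF happens to be serialized.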
Policies in RDF
How can policies be specified? Should policies be specified as shown in the examples, as extensions to RDF syntax? Should policies be specified as RDF documents? Is there an analogy to XPath expressions for RDF policies?
<policy-spec cred-expr="//Professor[department = 'CS']" target="annual_report.xml" path="//Patent[@Dept = 'CS']//Node()" priv="VIEW"/>
Security and Ontology. Ontologies used to specify security policies: for example, OWL to specify security policies; choice between XML, RDF, OWL, RuleML, etc. Security for ontologies: access control on ontologies; give access to certain parts of the ontology.
Policies in OWL
How can policies be specified? Should policies be specified as shown in the examples, as extensions to OWL syntax? Should policies be specified as OWL documents? Is there an analogy to XPath expressions for OWL policies?
<policy-spec cred-expr="//Professor[department = 'CS']" target="annual_report.xml" path="//Patent[@Dept = 'CS']//Node()" priv="VIEW"/>
Policies in OWL: Example
<owl:Class rdf:about="#associateProfessor">
  <owl:disjointWith rdf:resource="#professor"/>
  <owl:disjointWith rdf:resource="#assistantProfessor"/>
  Level = L1
</owl:Class>
<owl:Class rdf:ID="faculty">
  <owl:equivalentClass rdf:resource="academicStaffMember"/>
  Level = L2
</owl:Class>
Logic and Inference. First-order predicate logic: a high-level language to express knowledge; well-understood semantics; logical consequence (inference); proof systems exist that are sound and complete. OWL is based on a subset of logic: description logic.
Policies in RuleML
<fact>
  <atom>
    <predicate>p</predicate>
    <term>
      <const>a</const>
    </term>
  </atom>
  Level = L
</fact>
Common Threads and Challenges. Common threads: building ontologies for semantics; XML for syntax. Challenges: scalability and resolvability; security policy specification; securing the documents and ontologies; developing applications for secure semantic web technologies; automated tools for ontology management; creating, maintaining, evolving, and querying ontologies.