Building Trustworthy Semantic Webs
Lecture #5: XML and XML Security
Dr. Bhavani Thuraisingham
September 2006

Objective of the Unit
- This unit provides an overview of XML and then discusses some security issues

Outline of the Unit
- XML Elements
- XML Attributes
- XML DTD
- XML Schema
- XML Namespaces
- Federations
- Policy/Credential
- Access Control
- Third Party Publication
- XML Databases
- Inference Control

What is XML all about?
- XML is needed because of the limitations of HTML and the complexity of SGML
- It is an extensible markup language specified by the W3C (World Wide Web Consortium)
- Designed to make the interchange of structured documents over the Internet easier
- The key to XML used to be Document Type Definitions (DTDs)
  - A DTD defines the role of each element of text in a formal model
- XML schemas have now become critical for specifying structure
  - XML schemas are also XML documents

XML Elements
XML statement: John Smith is a Professor in Texas.
This can be expressed as follows:
<Professor>
  <Name> John Smith </Name>
  <State> Texas </State>
</Professor>

XML Elements
Now suppose this data can be read by anyone. We can then augment the XML statement with an additional element called access, as follows:
<Professor>
  <Name> John Smith </Name>
  <State> Texas </State>
  <access> All, Read </access>
</Professor>

XML Elements
If only HR can update this XML statement, then we have the following:
<Professor>
  <Name> John Smith </Name>
  <State> Texas </State>
  <access> HR department, Write </access>
</Professor>

XML Elements
We may not wish for everyone to know that John Smith is a professor, but we can give out the information that this professor is in Texas. This can be expressed as:
<Professor>
  <Name> John Smith, Govt-official, Read </Name>
  <State> Texas, All, Read </State>
  <access> HR department, Write </access>
</Professor>
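
The per-element labels above could be enforced along the following lines. This is a minimal sketch, assuming each labelled element carries text of the form "value, subject, privilege"; the element names, the helper `readable_view`, and the rule that "All" matches every subject are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

DOC = """
<Professor>
  <Name> John Smith, Govt-official, Read </Name>
  <State> Texas, All, Read </State>
  <access> HR department, Write </access>
</Professor>
"""

def readable_view(xml_text, subject):
    """Return {element tag: value} for the elements `subject` may read."""
    view = {}
    for child in ET.fromstring(xml_text):
        parts = [p.strip() for p in (child.text or "").split(",")]
        if len(parts) != 3:
            continue  # not a "value, subject, privilege" label (e.g. <access>)
        value, allowed, privilege = parts
        if privilege == "Read" and allowed in ("All", subject):
            view[child.tag] = value
    return view

govt_view = readable_view(DOC, "Govt-official")
public_view = readable_view(DOC, "Student")
```

Under these assumptions a Govt-official sees both the name and the state, while any other subject sees only the state.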

XML Attributes
Suppose we want to specify access based on attribute values. One way to specify such access is given below (an informal notation, not well-formed XML).
<Professor
  Name = "John Smith", Access = All, Read
  Salary = "60K", Access = Administrator, Read, Write
  Department = "Security", Access = All, Read >
</Professor>
Here we assume that everyone can read the name John Smith and the department Security, but only the administrator can read and write the salary attribute.
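
Since XML attributes cannot carry their own access labels, one possible realization keeps the labels in a side table. The sketch below assumes that design; the table `ATTRIBUTE_ACCESS` and the helpers `is_allowed` and `read_attributes` are hypothetical names, not part of any standard.

```python
import xml.etree.ElementTree as ET

PROFESSOR = ET.fromstring(
    '<Professor Name="John Smith" Salary="60K" Department="Security"/>'
)

# Hypothetical side table: attribute -> {subject: set of privileges}.
ATTRIBUTE_ACCESS = {
    "Name": {"All": {"Read"}},
    "Salary": {"Administrator": {"Read", "Write"}},
    "Department": {"All": {"Read"}},
}

def is_allowed(subject, attribute, privilege):
    """True if `subject`, or everyone via "All", holds `privilege` on `attribute`."""
    grants = ATTRIBUTE_ACCESS.get(attribute, {})
    return privilege in grants.get(subject, set()) | grants.get("All", set())

def read_attributes(element, subject):
    """Return only the attributes that `subject` is allowed to read."""
    return {
        name: value
        for name, value in element.attrib.items()
        if is_allowed(subject, name, "Read")
    }
```

With this table, `read_attributes(PROFESSOR, "Student")` omits Salary, while the Administrator sees all three attributes and is the only subject allowed to write Salary.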

XML DTD
DTDs essentially specify the structure of XML documents. Consider a DTD for Professor with elements Name and State. This can be specified as:
<!ELEMENT Professor (Name, State)>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT State (#PCDATA)>

XML Schema
While DTDs were an early attempt to specify the structure of XML documents, XML schemas are a far more elegant way to do so. Unlike DTDs, XML schemas use XML syntax for the specification itself. Consider the following example:
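
Because a schema is itself an XML document, an ordinary XML parser can read it. The sketch below assumes an illustrative schema for a Professor element with Name and State children (the schema text is an assumption, not the lecture's own example) and lists its element declarations using only the Python standard library. It parses the schema; it does not validate documents against it.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"  # the W3C XML Schema namespace

# Illustrative schema: a Professor is a sequence of Name and State strings.
SCHEMA = f"""
<xs:schema xmlns:xs="{XS}">
  <xs:element name="Professor">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Name" type="xs:string"/>
        <xs:element name="State" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

def declared_elements(schema_text):
    """List (name, type) for every xs:element declaration, in document order."""
    root = ET.fromstring(schema_text)
    return [
        (el.get("name"), el.get("type"))
        for el in root.iter(f"{{{XS}}}element")
    ]

declarations = declared_elements(SCHEMA)
```

The Professor declaration has no `type` attribute because its type is given inline, so its entry pairs the name with `None`.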

XML Namespaces
Namespaces are used for disambiguation.
<CountryX:Academic-Institution
    xmlns:CountryX = "... DTD"
    xmlns:USA = "... DTD"
    xmlns:UK = "... DTD">
  <USA:Title = College
   USA:Name = "University of Texas at Dallas"
   USA:State = "Texas">
  <UK:Title = University
   UK:Name = "Cambridge University"
   UK:State = "Cambs">
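
To show the disambiguation concretely, here is a small well-formed variant of the idea, queried with Python's standard library. The namespace URIs, the `Institutions` wrapper, and the element layout are hypothetical; only the institution names and titles come from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, chosen only for this illustration.
NS = {
    "usa": "http://example.org/usa-institutions",
    "uk": "http://example.org/uk-institutions",
}

DOC = f"""
<Institutions xmlns:usa="{NS['usa']}" xmlns:uk="{NS['uk']}">
  <usa:Institution>
    <usa:Title>College</usa:Title>
    <usa:Name>University of Texas at Dallas</usa:Name>
  </usa:Institution>
  <uk:Institution>
    <uk:Title>University</uk:Title>
    <uk:Name>Cambridge University</uk:Name>
  </uk:Institution>
</Institutions>
"""

root = ET.fromstring(DOC)

# The same local name "Title" resolves to different elements per namespace.
usa_titles = [t.text for t in root.findall(".//usa:Title", NS)]
uk_titles = [t.text for t in root.findall(".//uk:Title", NS)]
```

Both vocabularies use the local name Title, yet each query returns only the element from its own namespace.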

Federations/Distribution
Site 1 document: 111 John Smith Texas
Site 2 document: K

XML Query
- XML-QL, XQuery, etc. are query languages for XML
- XPath is used for query specification
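
As a small illustration of path-based querying, the sketch below uses the limited XPath subset supported by Python's `xml.etree.ElementTree`. The `Faculty` document and its second record are made-up sample data.

```python
import xml.etree.ElementTree as ET

# Made-up sample data for the query examples.
DOC = """
<Faculty>
  <Professor><Name>John Smith</Name><State>Texas</State></Professor>
  <Professor><Name>Alice Brown</Name><State>Ohio</State></Professor>
</Faculty>
"""

root = ET.fromstring(DOC)

# Path step: the Name child of every Professor under the root.
names = [n.text for n in root.findall("./Professor/Name")]

# Predicate: only Professors whose State child has the text "Texas".
texans = [p.findtext("Name") for p in root.findall("./Professor[State='Texas']")]
```

The predicate form `[State='Texas']` compares the complete text of the child element, so surrounding whitespace in the document would prevent a match.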

Presentations of XML Documents
- XSLT

Credentials in XML
Credential 1: Alice Brown, University of X, CS, Security
Credential 2: John James, University of X, CS, Senior

Policies in XML
<policy-spec cred-expr = "//Professor[department = 'CS']" target = "annual_report.xml" path = "... = 'CS']//Node()" priv = "VIEW"/>
<policy-spec cred-expr = "//Professor[department = 'CS']" target = "annual_report.xml" path = "... = 'EE']/Short-descr/Node() and //Patent... = 'EE']/authors" priv = "VIEW"/>
<policy-spec cred-expr = ...
Explanation: CS professors are entitled to access all the patents of their department. They are entitled to see only the short descriptions and authors of patents of the EE department.
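
The two policies explained above can be evaluated roughly as follows. This is a sketch under stated assumptions: the attribute name `dept`, the contents of both documents, and the helper `viewable` are hypothetical. ElementTree's XPath subset has no `Node()` test or `and` operator, so each policy lists its target paths separately.

```python
import xml.etree.ElementTree as ET

# Hypothetical credential base and target document.
CREDENTIALS = ET.fromstring("""
<credentials>
  <Professor><name>Alice Brown</name><department>CS</department></Professor>
  <Secretary><name>John James</name><department>CS</department></Secretary>
</credentials>
""")

REPORT = ET.fromstring("""
<report>
  <Patent dept="CS">
    <Short-descr>CS short</Short-descr><authors>CS authors</authors><details>CS details</details>
  </Patent>
  <Patent dept="EE">
    <Short-descr>EE short</Short-descr><authors>EE authors</authors><details>EE details</details>
  </Patent>
</report>
""")

# Each policy: (credential expression, target paths, privilege).
POLICIES = [
    (".//Professor[department='CS']", [".//Patent[@dept='CS']//*"], "VIEW"),
    (".//Professor[department='CS']",
     [".//Patent[@dept='EE']/Short-descr", ".//Patent[@dept='EE']/authors"],
     "VIEW"),
]

def viewable(subject_name, privilege="VIEW"):
    """Collect the text of every node the named subject may access."""
    texts = []
    for cred_expr, paths, priv in POLICIES:
        # Subjects whose credential satisfies the policy's credential expression.
        holders = {c.findtext("name") for c in CREDENTIALS.findall(cred_expr)}
        if priv != privilege or subject_name not in holders:
            continue
        for path in paths:
            texts.extend(node.text for node in REPORT.findall(path))
    return texts
```

Here the CS professor can view everything in the CS patent but only the short description and authors of the EE patent, while the secretary's credential matches no policy and yields an empty view.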

Access Control Strategy
- Subjects request access to XML documents under two modes: browsing and authoring
  - With browsing access, a subject can read and navigate documents
  - Authoring access is needed to modify, delete, or append to documents
- The access control module checks the policy base and applies the policy specs
- Views of the document are created based on credentials and policy specs
- In case of conflict, the least access privilege rule is enforced
- Works for Push/Pull modes
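
The strategy above can be sketched as a view builder that prunes whatever the subject may not access. The policy base, the role names, and the reading of "least access privilege" as "any matching denial overrides every grant" are assumptions made for illustration.

```python
import copy
import xml.etree.ElementTree as ET

DOC = ET.fromstring(
    "<Professor><Name>John Smith</Name><State>Texas</State>"
    "<Salary>60K</Salary></Professor>"
)

# Hypothetical policy base: (role, element tag, mode, granted?).
POLICY_BASE = [
    ("All", "Name", "browse", True),
    ("All", "State", "browse", True),
    ("HR", "Salary", "browse", True),
    ("HR", "Salary", "author", True),
    ("Contractor", "Salary", "browse", False),
]

def permitted(roles, tag, mode):
    """Least-privilege check: one matching denial overrides every grant."""
    decisions = [
        granted
        for role, policy_tag, policy_mode, granted in POLICY_BASE
        if policy_tag == tag and policy_mode == mode
        and (role == "All" or role in roles)
    ]
    # No applicable policy means no access.
    return bool(decisions) and all(decisions)

def build_view(doc, roles, mode="browse"):
    """Copy the document and prune the elements the subject may not access."""
    view = copy.deepcopy(doc)
    for child in list(view):
        if not permitted(roles, child.tag, mode):
            view.remove(child)
    return view

hr_view = [c.tag for c in build_view(DOC, {"HR"})]
conflict_view = [c.tag for c in build_view(DOC, {"HR", "Contractor"})]
```

A subject holding both the HR and Contractor roles hits a grant and a denial on Salary, and under this reading the denial wins, so Salary is pruned from the view.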

System Architecture for Access Control
[Figure: User (Pull/Query, Push/result); X-Access; X-Admin; Admin Tools; Policy base; Credential base; XML Documents]

Third-Party Architecture
[Figure: Owner; Publisher; User/Subject; XML Source; Credential base; Policy base; SE-XML; credentials; Query; Reply document]
- The Owner is the producer of information. It specifies access control policies
- The Publisher is responsible for managing (a portion of) the Owner's information and answering subject queries
- Goal: the Publisher is untrusted with respect to authenticity and completeness, so subjects must be able to check both
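
The slide does not say how a subject checks authenticity. One well-known option, shown here purely as an assumption, is a Merkle hash over the XML tree: the Owner publishes (and would sign) the root hash, the Publisher replaces hidden subtrees with their hashes, and the subject recomputes the root. The `pruned` placeholder element and both helper names are hypothetical, and signing is omitted.

```python
import hashlib
import xml.etree.ElementTree as ET

def merkle_hash(element):
    """Hex digest over the tag, the text, and the digests of all children."""
    if element.tag == "pruned":  # placeholder left by the Publisher
        return element.get("hash")
    h = hashlib.sha256()
    h.update(element.tag.encode() + b"\x00")
    h.update((element.text or "").strip().encode() + b"\x00")
    for child in element:
        h.update(merkle_hash(child).encode() + b"\x00")
    return h.hexdigest()

def prune(element, hidden_tags):
    """Publisher side: copy the tree, replacing hidden subtrees by their hash."""
    result = ET.Element(element.tag)
    result.text = element.text
    for child in element:
        if child.tag in hidden_tags:
            ET.SubElement(result, "pruned", hash=merkle_hash(child))
        else:
            result.append(prune(child, hidden_tags))
    return result

source = ET.fromstring(
    "<Professor><Name>John Smith</Name><State>Texas</State>"
    "<Salary>60K</Salary></Professor>"
)
owner_root = merkle_hash(source)          # the Owner would sign this value
reply = prune(source, {"Salary"})         # Publisher's reply to the subject
authentic = merkle_hash(reply) == owner_root

reply.find("State").text = "Ohio"         # simulate tampering by the Publisher
tampered_ok = merkle_hash(reply) == owner_root
```

The pruned reply still reproduces the Owner's root hash without revealing the salary, while any change to a visible node makes the recomputed hash differ. Completeness checking needs more machinery than this sketch provides.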

XML Databases
- Data is presented as XML documents
- Query language: XML-QL
- Query optimization
- Managing transactions on XML documents
- Metadata management: XML schemas/DTDs
- Access methods and index strategies
- XML security and integrity management

Inference/Privacy Control
[Figure: Interface to the Semantic Web; Inference Engine/Rules Processor; Policies, Ontologies, Rules; XML Database; XML Documents, Web Pages, Databases. Technology by UTD]

Example Policies
- Temporal Access Control
  - After 1/1/05, only doctors have access to medical records
- Role-based Access Control
  - Manager has access to salary information
  - Project leader has access to project budgets, but does not have access to salary information
  - What happens if the manager is also the project leader?
- Positive and Negative Authorizations
  - John has write access to EMP
  - John does not have read access to DEPT
  - John does not have write access to the Salary attribute in EMP
  - How are conflicts resolved?
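
The temporal and role-based examples can be sketched as below. The slide leaves conflict resolution as an open question, so the code exposes both common choices rather than picking one. The table names, the `denials_win` switch, and the assumption that medical records are unrestricted up to the cutoff date are all illustrative.

```python
from datetime import date

# Hypothetical encoding of the role-based examples.
ROLE_GRANTS = {
    "manager": {"salary"},
    "project_leader": {"project_budget"},
}
ROLE_DENIALS = {
    "project_leader": {"salary"},
}

def can_read_medical_record(role, on_date):
    """Temporal rule: after 1/1/05 only doctors have access."""
    if on_date > date(2005, 1, 1):
        return role == "doctor"
    return True  # assumption: no restriction up to the cutoff date

def can_access(roles, resource, denials_win=True):
    """Combine positive and negative authorizations across a subject's roles."""
    granted = any(resource in ROLE_GRANTS.get(r, set()) for r in roles)
    denied = any(resource in ROLE_DENIALS.get(r, set()) for r in roles)
    if denials_win:
        return granted and not denied  # negative authorization takes precedence
    return granted                     # positive authorization takes precedence

both_roles = {"manager", "project_leader"}
strict = can_access(both_roles, "salary", denials_win=True)
permissive = can_access(both_roles, "salary", denials_win=False)
```

For a manager who is also a project leader, the two resolution rules give opposite answers on salary, which is why a policy language has to state its conflict resolution rule explicitly.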

Privacy Policies
- Privacy constraints processing
  - Simple constraint: an attribute of a document is private
  - Content-based constraint: if a document contains information about X, then it is private
  - Association-based constraint: two or more documents taken together are private; individually each document is public
  - Release constraint: after X is released, Y becomes private
- Augment a database system with a privacy controller for constraint processing
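
The four constraint types can be sketched as a small privacy controller that tracks what has already been released. The class, its method names, and the sample constraint values used in the example run are hypothetical; content matching is reduced to a substring test for brevity.

```python
class PrivacyController:
    """Decides whether a document or attribute may be released."""

    def __init__(self, private_attrs, private_terms, private_groups, release_rules):
        self.private_attrs = set(private_attrs)                 # simple constraints
        self.private_terms = set(private_terms)                 # content-based
        self.private_groups = [set(g) for g in private_groups]  # association-based
        self.release_rules = dict(release_rules)                # release: X -> Y
        self.released = set()

    def attribute_is_private(self, attr):
        return attr in self.private_attrs

    def may_release(self, doc_id, content):
        # Content-based: the document mentions a private term.
        if any(term in content for term in self.private_terms):
            return False
        # Release constraint: an earlier release made this document private.
        if any(self.release_rules.get(x) == doc_id for x in self.released):
            return False
        # Association-based: this release would complete a private combination.
        after = self.released | {doc_id}
        if any(group <= after for group in self.private_groups):
            return False
        return True

    def release(self, doc_id, content):
        """Release the document if allowed and record it; return the decision."""
        if not self.may_release(doc_id, content):
            return False
        self.released.add(doc_id)
        return True

controller = PrivacyController(
    private_attrs={"salary"},
    private_terms={"diagnosis"},
    private_groups=[{"A", "B"}],
    release_rules={"X": "Y"},
)
decisions = [
    controller.release("M", "contains a diagnosis"),  # content-based
    controller.release("A", "public text"),
    controller.release("B", "public text"),           # association with A
    controller.release("X", "public text"),
    controller.release("Y", "public text"),           # private once X is out
]
```

Because the association and release constraints depend on history, the controller must record each release, which is why it is modelled as a stateful component placed in front of the database.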

Summary and Directions
- XML is widely used
- Securing XML documents is a challenge
- How can we specify the policies discussed in this unit in XML?
- How can query modification be carried out for XML documents?
- Design access control for XML databases