Security in the Semantic Web Hassan Abolhassani, Leila Sharif Sharif University of Technology
Outline ● Semantic Web: a short introduction ● Security model in the HTML-document-based web ● Security issues in a sample semantic web ● Analysis of solutions ● Query reformulation: centralized version ● Query reformulation: distributed version ● Security-aware inference engine
Semantic web ● Bringing meaning to the web ● Overcoming the limitations of the current web – Machine processing is not possible – Search engines return many unrelated results ● Impossible using the current web: – Finding information about animals that use sonar but are neither bats nor dolphins – Finding the (best) prices of goods and services – Delegating tasks to agents: “Book me a holiday next weekend somewhere warm, not too far away”
Syntactic web
Semantic web layers
Focus of this work
Simplified security model of the current web ● A page as a whole is subject to security ● All the concepts in a page are treated equally ● Searches return references to pages, not to concepts ● This model is not applicable to the semantic web
A sample semantic web ● company1 has name1 as its name ● person1 is the president of company1 ● This person has phone1 as his personal phone and phone2 as his office phone ● The company has partner1 as one of its partners ● partner1 has product1, with name1 and price1 as its name and price respectively ● ...
A sample semantic web (cont.)
Example queries (in OWL/QL)
● Finding the “personalPhone” number of the president of “company1”:
  Query: (“What is the personalPhone of president of company1”)
  Query Pattern: {(c:president company1 ?person) (p:personalPhone ?person ?phone)}
  Must-Bind Variables List: (?phone)
  May-Bind Variables List: ()
  Don't-Bind Variables List: ()
  Answer Pattern: {(p:personalPhone “president of company1” ?phone)}
● Should anyone be allowed to access the personal phone number of the company's president?
Example queries (in OWL/QL)
● Finding a partner company that provides product1:
  Query: (“What partner provides product1”)
  Query Pattern: {(c:partner company1 ?partner) (c:product ?partner prd:product1)}
  Must-Bind Variables List: (?partner)
  May-Bind Variables List: ()
  Don't-Bind Variables List: ()
  Answer Pattern: {(prd:product ?partner prd:product1)}
● Should anyone be allowed to access partner information?
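To make the example queries concrete, the following is a minimal Python sketch of how a query pattern could be matched against the sample semantic web. It is not an OWL/QL implementation. The triple encoding, the helper names (`is_var`, `match`, `answer`), and the predicates `c:name` and `p:officePhone` are assumptions introduced for illustration; only `c:president`, `p:personalPhone`, `c:partner`, and `c:product` appear in the slides.

```python
# Hypothetical encoding of the sample semantic web as (predicate, subject, object)
# triples, following the (predicate subject object) order of the query patterns.
KB = [
    ("c:name", "company1", "name1"),            # assumed predicate name
    ("c:president", "company1", "person1"),
    ("p:personalPhone", "person1", "phone1"),
    ("p:officePhone", "person1", "phone2"),     # assumed predicate name
    ("c:partner", "company1", "partner1"),
    ("c:product", "partner1", "prd:product1"),
]

def is_var(term):
    """Variables are written with a leading question mark, as in the slides."""
    return term.startswith("?")

def match(pattern, triple, bindings):
    """Unify one pattern with one triple; return extended bindings or None."""
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if new.setdefault(p, t) != t:   # variable already bound to another value
                return None
        elif p != t:                        # constant does not match
            return None
    return new

def answer(query_pattern, kb, bindings=None):
    """Yield every set of bindings that satisfies all patterns (a naive join)."""
    bindings = {} if bindings is None else bindings
    if not query_pattern:
        yield bindings
        return
    first, rest = query_pattern[0], query_pattern[1:]
    for triple in kb:
        extended = match(first, triple, bindings)
        if extended is not None:
            yield from answer(rest, kb, extended)

phone_query = [("c:president", "company1", "?person"),
               ("p:personalPhone", "?person", "?phone")]
results = list(answer(phone_query, KB))
```

Nothing in this evaluator checks who is asking, which is exactly the problem the two questions above raise: any caller obtains the binding for `?phone`.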
Differences between traditional web and semantic web security ● Concepts are linked, not web pages ● Queries instead of searches ● A query processor traverses a semantic web graph ● A different access privilege may be assigned to each node in the graph ● We don't want to repeat the limitations of the current web: – Each site has its own logon facility – Single sign-on was introduced to solve this ● How can security be added to the SW?
Ad-hoc solution ● Create several semantic webs: ● Separate non-public and public information ● This works but is not a general solution: ● It results in redundant information, with creation, maintenance, and other problems ● It is not feasible in a distributed model (the query processor would have to know the security measures of every site involved)
Query reformulation (filtering) ● Using OWL/QL features of variable binding
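One way to read "using the variable-binding features of OWL/QL" is that a filter rewrites the must-bind, may-bind, and don't-bind lists before the query reaches the query processor. The sketch below illustrates that reading only; the slides do not give the actual reformulation rules. The `Query` class, the `POLICY` table, the role names, and `reformulate` are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Query:
    pattern: list                                   # (predicate, subject, object) tuples
    must_bind: list = field(default_factory=list)
    may_bind: list = field(default_factory=list)
    dont_bind: list = field(default_factory=list)

# Hypothetical central policy: predicates whose values a role may NOT see.
POLICY = {
    "guest": {"p:personalPhone", "c:partner"},
    "employee": {"p:personalPhone"},
    "president": set(),
}

def reformulate(query, role, policy=POLICY):
    """Return a filtered copy of `query`; the original is left unchanged."""
    forbidden = policy[role]
    protected = []
    for predicate, _subject, obj in query.pattern:
        # Assumption: the object variable of a protected predicate is the secret.
        if predicate in forbidden and obj.startswith("?") and obj not in protected:
            protected.append(obj)
    return Query(
        pattern=list(query.pattern),
        must_bind=[v for v in query.must_bind if v not in protected],
        may_bind=[v for v in query.may_bind if v not in protected],
        dont_bind=query.dont_bind + [v for v in protected if v not in query.dont_bind],
    )

phone_query = Query(
    pattern=[("c:president", "company1", "?person"),
             ("p:personalPhone", "?person", "?phone")],
    must_bind=["?phone"],
)
guest_view = reformulate(phone_query, "guest")
president_view = reformulate(phone_query, "president")
```

Under this assumed policy, `?phone` moves to the don't-bind list for a guest, so the answer may no longer disclose it, while the president's query passes through unchanged.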
Query reformulation (cont.) ● Merits: ● Easy to implement ● No significant overhead on query processing ● Demerits: ● Not applicable when a semantic model is distributed
Distributed query reformulation [Figure: a query processor exchanging a query pattern and a filtered query pattern with the filtering agents of Site1, Site2, ..., SiteN]
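The distributed arrangement can be sketched as a chain of per-site filtering agents, each applying only its own policy to the query pattern. This is an illustrative Python sketch under stated assumptions, not the authors' design: the `FilteringAgent` class, the idea that a namespace prefix such as `c:` or `p:` identifies the owning site, and the policy contents are all hypothetical.

```python
class FilteringAgent:
    """Hypothetical per-site agent that knows only its own site's policy."""

    def __init__(self, namespace, forbidden_by_role):
        self.namespace = namespace                  # prefix of predicates this site owns
        self.forbidden_by_role = forbidden_by_role  # role -> set of protected predicates

    def filter(self, query, role):
        """Return a copy of the query with locally protected variables blocked."""
        forbidden = self.forbidden_by_role.get(role, set())
        protected = []
        for predicate, _subject, obj in query["pattern"]:
            owned = predicate.startswith(self.namespace)
            if owned and predicate in forbidden and obj.startswith("?") and obj not in protected:
                protected.append(obj)
        return {
            "pattern": list(query["pattern"]),
            "must_bind": [v for v in query["must_bind"] if v not in protected],
            "may_bind": [v for v in query["may_bind"] if v not in protected],
            "dont_bind": query["dont_bind"] + [v for v in protected if v not in query["dont_bind"]],
        }

def distributed_reformulate(query, role, agents):
    """Pass the query through every site's agent before it is processed."""
    for agent in agents:
        query = agent.filter(query, role)
    return query

agents = [
    FilteringAgent("c:", {"guest": {"c:partner"}}),        # assumed Site1 policy
    FilteringAgent("p:", {"guest": {"p:personalPhone"}}),  # assumed Site2 policy
]
query = {
    "pattern": [("c:president", "company1", "?person"),
                ("p:personalPhone", "?person", "?phone")],
    "must_bind": ["?phone"], "may_bind": [], "dont_bind": [],
}
filtered = distributed_reformulate(query, "guest", agents)
```

No agent needs to know another site's policy, but every query pays for one filtering pass per site, which is the pre-processing overhead of this approach.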
Distributed query reformulation (cont.) ● Merits: ● No centralized control of security is needed ● Applicable to semantic webs that are distributed ● Demerits: ● Overhead of pre-processing ● Filtering agents are needed at each site ● May introduce security holes (should be investigated further)
Security aware inference engines ● It is clear by now that the general solution is to add security at the level of inference engines ● A security model requires a formalism ● The basic formal model for the SW is considered to be description logic: ● A variable-free logic formalism ● A decidable fragment of first-order logic ● All constructs are convertible to first-order logic with unary and binary predicates
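As an illustration of the last point, the standard textbook translation maps each concept to a first-order formula with one free variable, with concept names becoming unary predicates and role names binary predicates. The notation $\pi_x$ is introduced here for illustration and does not appear in the slides.

```latex
\begin{align*}
\pi_x(A) &= A(x) \\
\pi_x(\neg C) &= \neg\,\pi_x(C) \\
\pi_x(C \sqcap D) &= \pi_x(C) \land \pi_x(D) \\
\pi_x(C \sqcup D) &= \pi_x(C) \lor \pi_x(D) \\
\pi_x(\exists R.C) &= \exists y\,\bigl(R(x,y) \land \pi_y(C)\bigr) \\
\pi_x(\forall R.C) &= \forall y\,\bigl(R(x,y) \rightarrow \pi_y(C)\bigr)
\end{align*}
```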
Security aware inference engines (cont.) ● Basic description logic: AL (attributive language) ● Sample statements:
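The slide's own syntax figure and sample statements are not reproduced in this text. For reference, the usual textbook grammar of AL concepts is given below, where $A$ is an atomic concept and $R$ an atomic role, followed by two sample concepts. The names Company, Person, partner, and personalPhone are hypothetical, chosen to echo the sample semantic web.

```latex
\[
C, D \;\longrightarrow\; A \;\mid\; \top \;\mid\; \bot \;\mid\; \neg A \;\mid\;
C \sqcap D \;\mid\; \forall R.C \;\mid\; \exists R.\top
\]
\[
\mathit{Company} \sqcap \forall \mathit{partner}.\mathit{Company}
\qquad
\mathit{Person} \sqcap \exists \mathit{personalPhone}.\top
\]
```

In AL, negation applies only to atomic concepts and existential quantification only to $\top$; ALC lifts both restrictions.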
Security aware inference engines (cont.) ● Inference in description logic ● Tableau-based reasoning algorithms have been developed ● These algorithms work by applying expansion (completion) rules ● A tree is expanded starting from the original statement (i.e. the query) ● The algorithm stops when a clash appears (i.e. C and ~C in the same node)
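The expansion process can be sketched in a few lines of Python. This is a minimal, unoptimized satisfiability check for ALC concepts without a TBox, written for illustration and not taken from the slides. It assumes concepts are already in negation normal form; the tuple encoding and the function names are assumptions.

```python
# Concepts are nested tuples in negation normal form (negation only on atoms):
#   ("atom", "A")      ("not", "A")      ("and", C, D)      ("or", C, D)
#   ("some", "r", C)   for an existential restriction on role r
#   ("all", "r", C)    for a universal restriction on role r

def has_clash(label):
    """A clash is an atom A together with its negation in the same node."""
    return any(c[0] == "atom" and ("not", c[1]) in label for c in label)

def satisfiable(label):
    """Return True if the set of concepts in `label` can be expanded clash-free."""
    label = frozenset(label)
    if has_clash(label):
        return False
    for c in label:
        if c[0] == "and" and not {c[1], c[2]} <= label:
            # Conjunction rule: add both conjuncts to the node.
            return satisfiable(label | {c[1], c[2]})
        if c[0] == "or" and c[1] not in label and c[2] not in label:
            # Disjunction rule: branch on the two disjuncts.
            return satisfiable(label | {c[1]}) or satisfiable(label | {c[2]})
    # Existential and universal rules: each ("some", r, C) needs an r-successor
    # containing C plus every D for which ("all", r, D) is in the current node.
    for c in label:
        if c[0] == "some":
            successor = {c[2]} | {d[2] for d in label if d[0] == "all" and d[1] == c[1]}
            if not satisfiable(successor):
                return False
    return True

A, NOT_A, B = ("atom", "A"), ("not", "A"), ("atom", "B")
clashing = ("and", ("some", "r", A), ("all", "r", NOT_A))
consistent = ("and", ("some", "r", A), ("all", "r", B))
```

Here `satisfiable({clashing})` fails because the r-successor must contain both A and its negation, whereas `satisfiable({consistent})` succeeds.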
Security aware inference engines (cont.) ● Example of inference
Security aware inference engines (cont.) ● Expansion rules for ALC
Security aware inference engines (cont.) ● Adding security semantics to expansion rules
Security aware inference engines (cont.) ● Security-added tableau algorithm: apply completion rules in arbitrary order as long as possible: – Stop in case of a clash – Stop in case of a “security violation” – Terminate if no completion rule is applicable ● The output of the reasoner depends on how processing terminates: ● In case of a clash, the output can be something like: “The query is not answerable by the knowledge base” ● In case of a security violation: “You are not allowed to traverse parts of the knowledge base needed to respond to your query”
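The three ways of terminating can be sketched as follows. The slides' actual security-extended rules are not reproduced in this text, so the check used here is an assumption: expansion raises a violation as soon as a node mentions a concept or role name in the user's `forbidden` set. The tuple encoding of concepts (negation normal form, no TBox) and all names in the code are hypothetical; the two messages are taken from the slide.

```python
CLASH_MSG = "The query is not answerable by the knowledge base"
VIOLATION_MSG = ("You are not allowed to traverse parts of the knowledge base "
                 "needed to respond to your query")
DONE_MSG = "Expansion completed: no completion rule is applicable"  # assumed wording

class SecurityViolation(Exception):
    """Raised when expansion reaches a name the user may not traverse."""

def check_access(concept, forbidden):
    # Concepts: ("atom", A), ("not", A), ("and", C, D), ("or", C, D),
    #           ("some", r, C), ("all", r, C)
    if concept[0] in ("atom", "not", "some", "all") and concept[1] in forbidden:
        raise SecurityViolation(concept[1])

def expand(label, forbidden):
    """Return True if expansion completes clash-free, False on a clash."""
    label = frozenset(label)
    for c in label:
        check_access(c, forbidden)          # stop in case of a security violation
    if any(c[0] == "atom" and ("not", c[1]) in label for c in label):
        return False                        # stop in case of a clash
    for c in label:
        if c[0] == "and" and not {c[1], c[2]} <= label:
            return expand(label | {c[1], c[2]}, forbidden)
        if c[0] == "or" and c[1] not in label and c[2] not in label:
            return expand(label | {c[1]}, forbidden) or expand(label | {c[2]}, forbidden)
    for c in label:
        if c[0] == "some":
            successor = {c[2]} | {d[2] for d in label if d[0] == "all" and d[1] == c[1]}
            if not expand(successor, forbidden):
                return False
    return True                             # no completion rule is applicable

def run_query(concept, forbidden):
    """Map the way expansion terminated to the reasoner's output message."""
    try:
        complete = expand({concept}, forbidden)
    except SecurityViolation:
        return VIOLATION_MSG
    return DONE_MSG if complete else CLASH_MSG

phone = ("and",
         ("some", "president", ("atom", "Person")),
         ("all", "president", ("some", "personalPhone", ("atom", "Phone"))))
```

With `forbidden={"personalPhone"}` the call `run_query(phone, ...)` ends with the violation message, because the successor created for `president` would have to traverse `personalPhone`; with an empty `forbidden` set the same query expands to completion.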
Security aware inference engines (cont.) ● Merits: ● An algorithm based on a formal language ● Complexity is the same as that of the tableau algorithm ● Demerits: ● ?
Conclusions ● Differences between the security models of the syntactic web and the semantic web were identified ● Several solutions were proposed: ● Ad-hoc: applicable to small closed organizations ● Centralized filtering: applicable to a small society of organizations ● Distributed filtering: applicable to any society of organizations, but with preprocessing overhead ● Security-aware inference engines: no limitations have been identified so far
Thank you