“This presentation is for informational purposes only and may not be incorporated into a contract or agreement.”
RDF Data Management In Oracle10g Jayant Sharma Technical Director, Spatial Oracle Server Technologies
Overview Network Data Model RDF Technology Background RDF Data Model, Query, Rulebases Summary
Managing Spatial Types in Oracle10g
Oracle10g Spatial Data: Road Networks (directed graph); Zip Codes Boundaries (polygons); Annotation (text); Aerial Imagery (raster); Business Addresses (geocoded points); Structured Topology (parcels)
Network Data Model (NDM)
Feature of Oracle Spatial 10g. Tool for managing graphs (networks) in the database. Supports directed and undirected networks, spatial networks, and logical networks. Consists of a network database schema and a Java API for representation and analysis.
Oracle Spatial is an option of the Oracle database, and the Network Data Model is one of the many features provided with Oracle Spatial 10g. A network or graph captures relationships between objects using connectivity. NDM supports both directed and undirected networks, which can be either spatial or logical. Spatial networks contain both connectivity information and geometric information. Logical networks contain connectivity information but no geometric information. NDM consists of two components: a network database schema and a Java API. The network schema contains network metadata and tables for nodes and links. The Java API enables network representation and network analysis.
Network Schema
Network Metadata: Name, Type, Node/Link/Path Table Information
Network Tables:
  Node Table: Node_ID, Node_Type, Geometry, …
  Link Table: Link_ID, Link_Type, Start_Node_ID, End_Node_ID, Cost, Geometry, …
  Path Table: Path_ID, Start_Node_ID, End_Node_ID, Cost, Geometry, Path Links, …
Application information is added to the network schema: add additional columns to the node, link, and path tables directly, or add foreign key(s) to the node, link, and path tables.
NDM consists of two components. The first is the network database schema. NDM data is stored in Oracle tables, like any other data; the only difference is that the tables must contain the information listed above. There must be a table that identifies the metadata of the network. There must be a node table, where all information about the nodes is stored, e.g. Node_ID, Node_Type, Geometry. There must be a link table, which contains a Link_ID, a Link_Type, a Start_Node_ID, an End_Node_ID, a Cost, and a Geometry. Finally, there must be a path table, which contains a Path_ID, a Start_Node_ID, an End_Node_ID, a Cost, a Geometry, and Path Links. Any other information of interest can be stored in these tables too; this information may be specific to the application used to view the data.
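To make the link table columns concrete, here is a minimal Python sketch (not the NDM Java API) that treats link rows as records with Start_Node_ID, End_Node_ID, and Cost, builds a directed or undirected adjacency map from them, and runs a least-cost path analysis. The names Link, build_adjacency, and shortest_path_cost, as well as the sample rows, are assumptions made for illustration only.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    # Mirrors a subset of the link table columns named on the slide.
    link_id: int
    start_node_id: int
    end_node_id: int
    cost: float


def build_adjacency(links, directed=True):
    """Map each node id to a list of (neighbor id, cost) pairs."""
    adj = {}
    for link in links:
        adj.setdefault(link.start_node_id, []).append((link.end_node_id, link.cost))
        adj.setdefault(link.end_node_id, [])
        if not directed:
            # An undirected link can be traversed in both directions.
            adj[link.end_node_id].append((link.start_node_id, link.cost))
    return adj


def shortest_path_cost(adj, start, end):
    """Dijkstra's algorithm; returns None when end is unreachable."""
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == end:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in adj.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return None


# Hypothetical link rows: (Link_ID, Start_Node_ID, End_Node_ID, Cost).
links = [Link(1, 1, 2, 4.0), Link(2, 2, 3, 1.0), Link(3, 1, 3, 7.0)]
```

With these rows, the directed network has no path from node 3 back to node 1, while the undirected network does; that is the practical difference between the two network kinds mentioned on the slide.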
Network Data Model APIs
PL/SQL Package (server-side): Network data query and management. Maintains referential integrity and validation. Supports link/node/path updates.
Java API (mid-tier or client side): Network loading/storage. Network analysis. Network creation/editing.
Java Functional Extensibility: Network, Node, Link, Path are Java interfaces. Application-based network analysis extensions.
Network information is stored and managed (through PL/SQL and SQL) in the relational database, and network representation, network loading, and network analysis are done in the client or application tier using the Java API. The Oracle network data model separates network data and application information, so that applications can focus on specific application knowledge.
Oracle 10g RDF Approach:
Extended the existing Oracle10g network (graph) data model (NDM) to support RDF object types
Support for user-defined rules, rulebases, and rules indexes
RDF data model with (user-defined) rules to support inferencing
Enable combined SQL query of enterprise database and RDF graphs
Support large, complex models (tens of millions of statements)
Easily extensible by 3rd party tools/apps
RDF Graph
Types of elements: URIs, Blank Nodes, and Literals
  Blank Nodes: _:r1
  Plain Literals: “John”, “color”@en-us
  Typed Literals: “16”^^xsd:int, “John”^^xsd:string
RDF Triples: <subject predicate object>
  Subject: URIs or Blank Nodes
  Predicate: URIs
  Object: URIs, Blank Nodes, or Literals
A set of RDF triples constitutes an RDF graph.
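The position constraints above (which element types may appear as subject, predicate, and object) can be sketched in Python as follows. This is an illustrative model only, not how Oracle stores RDF; the class names, the make_triple helper, and the :age property are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str  # e.g. "r1" for _:r1


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: Optional[str] = None   # set for typed literals, e.g. "xsd:int"
    language: Optional[str] = None   # set for language-tagged plain literals


def make_triple(subject, predicate, obj):
    """Build a (subject, predicate, object) tuple, enforcing the RDF position rules."""
    if not isinstance(subject, (URI, BlankNode)):
        raise TypeError("subject must be a URI or a blank node")
    if not isinstance(predicate, URI):
        raise TypeError("predicate must be a URI")
    if not isinstance(obj, (URI, BlankNode, Literal)):
        raise TypeError("object must be a URI, a blank node, or a literal")
    return (subject, predicate, obj)


FAMILY = "http://www.example.org/family/"

# An RDF graph is simply a set of triples.
graph = {
    make_triple(URI(FAMILY + "John"), URI(FAMILY + "name"), Literal("John")),
    # :age is a hypothetical property used only to show a typed literal.
    make_triple(BlankNode("r1"), URI(FAMILY + "age"), Literal("16", datatype="xsd:int")),
}
```

Passing a Literal as the subject, or a BlankNode as the predicate, raises TypeError, which mirrors the restrictions listed on the slide.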
Family: Schema
[Diagram: classes :Person, :Male, :Female; properties :parentOf, :fatherOf, :motherOf, :siblingOf, :brotherOf, :sisterOf; edges labeled rdfs:subClassOf, rdfs:subPropertyOf, rdfs:domain, rdfs:range]
RDF Graph: Example Family: Classes and Properties (:Male rdfs:subClassOf :Person) (:Female rdfs:subClassOf :Person) (:fatherOf rdfs:subPropertyOf :parentOf) (:motherOf rdfs:subPropertyOf :parentOf) (:brotherOf rdfs:subPropertyOf :siblingOf) (:sisterOf rdfs:subPropertyOf :siblingOf)
RDF Graph: Example Family: Domains and Ranges of Properties (:fatherOf rdfs:domain :Male) (:fatherOf rdfs:range :Person) (:motherOf rdfs:domain :Female) (:motherOf rdfs:range :Person) (:brotherOf rdfs:domain :Male) (:brotherOf rdfs:range :Person) (:sisterOf rdfs:domain :Female) (:sisterOf rdfs:range :Person)
Family: Data
[Diagram: individuals :John, :Janice, :Martha, :Matt, :Sammy, :Suzie, :Cindy, :Tom, :Cathy, :Jack; classes :Female, :Male; edges labeled :fatherOf, :motherOf, :sisterOf, rdf:type]
Components
[Diagram: application tables (A1, A2, …, An); models (Model 1, Model 2, …, Model n); rulebases (Rulebase 1, Rulebase 2, …, Rulebase m); rules indexes (Rules Index 1, Rules Index 2, …, Rules Index p); operations: RDF Query, DDL, Load, DML]
Querying RDF data
RDF Querying Problem
Given: an RDF dataset (graphs) to be searched, and a graph-pattern containing a set of variables.
Find: subgraphs that match the graph-pattern.
Return: sets of variable bindings; each set corresponds to a matching subgraph (substituting the bindings into the graph-pattern produces the subgraph).
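The problem statement above can be illustrated with a naive matcher in Python. This is a sketch of the general idea only, not of how SDO_RDF_MATCH is implemented: terms are plain strings, variables start with "?", and the sample triples are hypothetical (they are not taken from the family data diagram).

```python
def _unify(term, value, binding):
    """Match one pattern term against one triple element, extending binding in place."""
    if term.startswith("?"):
        if term in binding:
            return binding[term] == value  # variable already bound: must agree
        binding[term] = value
        return True
    return term == value  # constant: must be identical


def match_pattern(graph, pattern):
    """Return a list of variable bindings, one dict per matching subgraph."""
    bindings = [{}]
    for triple_pattern in pattern:
        extended = []
        for binding in bindings:
            for triple in graph:
                candidate = dict(binding)  # copy so a failed match leaves binding intact
                if all(_unify(p, t, candidate) for p, t in zip(triple_pattern, triple)):
                    extended.append(candidate)
        bindings = extended
    return bindings


# Hypothetical triples, for illustration only.
graph = [
    (":John", ":fatherOf", ":Suzie"),
    (":John", "rdf:type", ":Male"),
    (":Suzie", "rdf:type", ":Female"),
    (":Suzie", ":motherOf", ":Tom"),
]

# Graph-pattern with two triple patterns sharing the variable ?y.
pattern = [("?x", ":fatherOf", "?y"), ("?y", "rdf:type", ":Female")]
result = match_pattern(graph, pattern)
```

Here result holds one binding set, with ?x bound to :John and ?y bound to :Suzie; substituting it into the pattern reproduces the two matching triples.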
RDF Querying Approach
SQL-based approach: Introduces a SQL table function, SDO_RDF_MATCH, that accepts RDF queries. Benefits: leverage powerful constructs of SQL to process RDF query results; combine with SQL queries without staging.
Alternate approach: Create new (declarative, SQL-like) languages, e.g., RQL, SeRQL, TRIPLE, Versa, SPARQL, RDQL, RDFQL, SquishQL.
Embedding RDF Query in SQL
SELECT … FROM …, TABLE ( <RDF Query> ) t, … WHERE …
The RDF Query is expressed via an SDO_RDF_MATCH invocation.
SDO_RDF_MATCH Table Function: Input Parameters
SDO_RDF_MATCH (
  Query,      graph-pattern (with variables)
  Models,     set of RDF models
  Rulebases,  set of rulebases (e.g., RDFS)
  Aliases,    aliases for namespaces
  Filter      additional selection criteria
)
The return type in the definition is AnyDataSet. The actual return type is determined at compile time, based on the arguments for each specific invocation.
Query Example
select m
from TABLE(SDO_RDF_MATCH(
  '(?m rdf:type :Male)',
  SDO_RDF_Models('family'),
  null,
  SDO_RDF_Aliases(
    SDO_RDF_Alias('', 'http://www.example.org/family/')),
  null));

M
--------------------------------------------------------------------------------
http://www.example.org/family/Jack
http://www.example.org/family/Tom

The table function returns a single-column table: M
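The query writes :Male while the result rows carry full URIs; the alias for the empty prefix bridges the two. A minimal Python sketch of that prefix expansion follows. The expand function is hypothetical, and treating rdf as an available alias is an assumption of this sketch, not a statement about Oracle's behavior.

```python
def expand(term, aliases):
    """Expand a prefixed name such as ':Male' or 'rdf:type' to a full URI.

    Variables (e.g. '?m') and terms whose prefix has no alias are returned unchanged.
    """
    prefix, sep, local = term.partition(":")
    if sep and prefix in aliases:
        return aliases[prefix] + local
    return term


aliases = {
    # Empty prefix, as in SDO_RDF_Alias('', 'http://www.example.org/family/').
    "": "http://www.example.org/family/",
    # Assumed to be available for this sketch.
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}

pattern = ("?m", "rdf:type", ":Male")
expanded = tuple(expand(term, aliases) for term in pattern)
```

After expansion the pattern refers to http://www.example.org/family/Male, which is why the bindings returned for ?m are full URIs in the same namespace.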
Join with SQL tables: Example
Find salary and hiredate of Matt’s grandfather(s)
SELECT emp.name, emp.salary, emp.hiredate
FROM emp, TABLE(SDO_RDF_MATCH(
  '(?x :fatherOf ?y) (?y :parentOf :Matt) (?x :name ?name)',
  SDO_RDF_Models('family'), …)) t
WHERE emp.name = t.name;
Inference
Rulebase: Overview
Each rulebase consists of a set of rules.
Each rule consists of:
  antecedent: graph-pattern
  filter condition (optional)
  consequent: graph-pattern
One or more rulebases may be used with relevant RDF models (graphs) to infer new data.
Rulebase: Example
Oracle-supplied, pre-loaded rulebases, e.g., RDFS:
  rdfs:subClassOf is transitive and reflexive
  Antecedent: '(?x rdf:type ?y) (?y rdfs:subClassOf ?c)'
  Consequent: '(?x rdf:type ?c)'
  Antecedent: '(?x ?p ?y) (?p rdfs:domain ?c)'
  Consequent: '(?x rdf:type ?c)'
Rules in a rulebase family_rb:
  Antecedent: '(?x :parentOf ?y) (?y :parentOf ?z)'
  Consequent: '(?x :grandParentOf ?z)'
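A rule of this shape can be read as "for every binding that matches the antecedent, emit the consequent with the variables substituted". The Python sketch below applies the family_rb grandParentOf rule once; it is an illustration under assumed names (match_pattern, apply_rule) and hypothetical data, not Oracle's inference engine.

```python
def _unify(term, value, binding):
    """Match one pattern term against one triple element, extending binding in place."""
    if term.startswith("?"):
        if term in binding:
            return binding[term] == value
        binding[term] = value
        return True
    return term == value


def match_pattern(graph, pattern):
    """Return all variable bindings under which every triple pattern occurs in graph."""
    bindings = [{}]
    for triple_pattern in pattern:
        extended = []
        for binding in bindings:
            for triple in graph:
                candidate = dict(binding)
                if all(_unify(p, t, candidate) for p, t in zip(triple_pattern, triple)):
                    extended.append(candidate)
        bindings = extended
    return bindings


def apply_rule(graph, antecedent, consequent, filter_fn=None):
    """Apply one rule once; return the triples it infers that are not already in graph."""
    inferred = set()
    for binding in match_pattern(graph, antecedent):
        if filter_fn is not None and not filter_fn(binding):
            continue  # optional filter condition rejected this binding
        for triple_pattern in consequent:
            triple = tuple(binding.get(term, term) for term in triple_pattern)
            if triple not in graph:
                inferred.add(triple)
    return inferred


# Hypothetical data, for illustration only.
graph = {
    (":John", ":parentOf", ":Suzie"),
    (":Suzie", ":parentOf", ":Tom"),
}

antecedent = [("?x", ":parentOf", "?y"), ("?y", ":parentOf", "?z")]
consequent = [("?x", ":grandParentOf", "?z")]
new_triples = apply_rule(graph, antecedent, consequent)
```

The optional filter_fn parameter stands in for the rule's filter condition: a predicate over the binding that can veto an inference.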
Rules Index: Overview A rules index is created on an RDF dataset (consisting of a set of RDF models and a set of RDF rulebases) A rules index contains RDF triples inferred from the model-rulebase combination
Rules Index: Example A rules index may be created on a dataset consisting of family RDF data, and family_rb rulebase (shown earlier) The rules index will contain inferred triples showing grandParentOf relationship
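One way to picture a rules index is as the set of triples obtained by applying all rules of the chosen rulebases repeatedly until nothing new appears. The Python sketch below does this for a hypothetical two-generation family fragment, combining the standard RDFS subPropertyOf rule with the family_rb grandParentOf rule. The function names and data are assumptions, and Oracle's actual index construction is not described in the slides.

```python
def _unify(term, value, binding):
    """Match one pattern term against one triple element, extending binding in place."""
    if term.startswith("?"):
        if term in binding:
            return binding[term] == value
        binding[term] = value
        return True
    return term == value


def match_pattern(graph, pattern):
    """Return all variable bindings under which every triple pattern occurs in graph."""
    bindings = [{}]
    for triple_pattern in pattern:
        extended = []
        for binding in bindings:
            for triple in graph:
                candidate = dict(binding)
                if all(_unify(p, t, candidate) for p, t in zip(triple_pattern, triple)):
                    extended.append(candidate)
        bindings = extended
    return bindings


def build_rules_index(model, rules):
    """Apply the rules to a fixpoint; return only the inferred (non-asserted) triples."""
    known = set(model)
    index = set()
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            for binding in match_pattern(known, antecedent):
                for triple_pattern in consequent:
                    triple = tuple(binding.get(term, term) for term in triple_pattern)
                    if triple not in known:
                        known.add(triple)
                        index.add(triple)
                        changed = True
    return index


# Hypothetical model: two schema triples plus two data triples.
model = {
    (":fatherOf", "rdfs:subPropertyOf", ":parentOf"),
    (":motherOf", "rdfs:subPropertyOf", ":parentOf"),
    (":John", ":fatherOf", ":Suzie"),
    (":Suzie", ":motherOf", ":Tom"),
}

rules = [
    # RDFS: a statement made with a sub-property also holds for the super-property.
    ([("?x", "?p", "?y"), ("?p", "rdfs:subPropertyOf", "?q")], [("?x", "?q", "?y")]),
    # family_rb rule from the earlier slide.
    ([("?x", ":parentOf", "?y"), ("?y", ":parentOf", "?z")], [("?x", ":grandParentOf", "?z")]),
]

rules_index = build_rules_index(model, rules)
```

The grandParentOf triple only appears after the parentOf triples have been inferred, which is why the rules are applied to a fixpoint rather than in a single pass.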
RDF Query with Inference
Query w/ RDFS+Family Inference
select x, y
from TABLE(SDO_RDF_MATCH(
  '(?x :grandParentOf ?y) (?x rdf:type :Male)',
  SDO_RDF_Models('family'),
  SDO_RDF_Rulebases('RDFS', 'family_rb'),
  SDO_RDF_Aliases(
    SDO_RDF_Alias('', 'http://www.example.org/family/')),
  null));

X                                    Y
------------------------------------ ------------------------------------
http://www.example.org/family/John   http://www.example.org/family/Cindy
http://www.example.org/family/John   http://www.example.org/family/Tom
http://www.example.org/family/John   http://www.example.org/family/Jack
http://www.example.org/family/John   http://www.example.org/family/Cathy
Some Oracle10g RDF Partners: Cerebra, Cognia, Siderean, Tom Sawyer, Top Quadrant
Summary
Comprehensive RDF support, fully integrated into SQL, in Oracle 10g Release 2:
  Models (Graphs)
  Rulebases
  Rules Indexes
  Query using the SDO_RDF_MATCH table function
Documentation and White Papers: http://www.oracle.com/technology/tech/semantic_technologies/index.html