XCube
XML For Data Warehouses
By Sven Groot
Data warehouses
Contain data drawn from several databases and external sources
Provide a comprehensive view of all aspects of an enterprise
Complemented by increased emphasis on powerful analysis tools
– SQL is inadequate
– OLAP: OnLine Analytic Processing
Data Warehousing
[Diagram: external data sources and operational databases are extracted, cleaned, transformed, loaded and refreshed into the data warehouse, which has a metadata repository and serves OLAP, visualisation and data mining.]
OLAP
Multidimensional data model
[Figure: data cube with dimensions timeid, pid, locid]
OLAP (cont’d)
Multidimensional data as a relation

Locations
locid  city    state    country
1      Ames    Iowa     USA
2      Leiden  ZH       Holland
3      Tempe   Arizona  USA

Products
pid  pname      category     price
11   Lee Jeans  Apparel      25
12   X-Box      Electronics
     Biro Pen   Stationery   2

Sales
pid  timeid  locid  sales
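The relational view above can be sketched in a few lines of Python: the fact table maps a (pid, timeid, locid) key to a sales value, and each dimension table is a lookup keyed by its id. The location rows and the two product names come from the slide; the sales values and the helper names `slice_by_location` and `describe` are assumptions made for illustration, since the slide shows no fact rows.

```python
# Dimension tables, keyed by their id (rows taken from the slide).
locations = {
    1: {"city": "Ames", "state": "Iowa", "country": "USA"},
    2: {"city": "Leiden", "state": "ZH", "country": "Holland"},
    3: {"city": "Tempe", "state": "Arizona", "country": "USA"},
}
products = {
    11: {"pname": "Lee Jeans", "category": "Apparel"},
    12: {"pname": "X-Box", "category": "Electronics"},
}

# Fact table: one cube cell per (pid, timeid, locid).
# These sales values are made up for illustration only.
sales = {
    (11, 1, 1): 25,
    (11, 2, 1): 8,
    (12, 1, 2): 30,
    (12, 1, 3): 10,
}


def slice_by_location(facts, locid):
    """Fix the location dimension and return the remaining 2-D slice."""
    return {
        (pid, timeid): value
        for (pid, timeid, loc), value in facts.items()
        if loc == locid
    }


def describe(facts):
    """Join each fact with its dimension rows, like a star-schema join."""
    return [
        (products[pid]["pname"], timeid, locations[locid]["city"], value)
        for (pid, timeid, locid), value in sorted(facts.items())
    ]


ames_slice = slice_by_location(sales, 1)
joined = describe(sales)
```

Keying the facts by the full tuple of dimension ids keeps the cube sparse: only cells that actually hold a measure are stored.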
OLAP (cont’d)
Dimensions as hierarchies (levels listed from coarsest to finest)
PRODUCT: category, pname
TIME: year, quarter, week / month, date
LOCATION: country, state, city
OLAP (cont’d)
Typical OLAP queries
– Find the total sales
– Find total sales for each city
– Find total sales for each state
– Find the top five products ranked by total sales
Possible to drill-down and roll-up on dimensions
Pivoting
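The queries listed above can be sketched in plain Python. This is only an illustration: the fact rows are invented, and `total_by` is a hypothetical helper, not part of any OLAP product. Rolling up means grouping by a coarser level of the location hierarchy (state or country instead of city); drilling down is the reverse.

```python
from collections import defaultdict

# locid -> (city, state, country), from the Locations table on the earlier slide.
locations = {
    1: ("Ames", "Iowa", "USA"),
    2: ("Leiden", "ZH", "Holland"),
    3: ("Tempe", "Arizona", "USA"),
}

# Hypothetical fact rows: (pid, timeid, locid, sales).
facts = [
    (11, 1, 1, 25),
    (11, 2, 1, 8),
    (12, 1, 2, 30),
    (12, 1, 3, 10),
    (11, 1, 3, 5),
]


def total_by(rows, key=None):
    """Sum sales grouped by key(row); with key=None return the grand total."""
    if key is None:
        return sum(row[3] for row in rows)
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[3]
    return dict(totals)


grand_total = total_by(facts)
by_city = total_by(facts, lambda r: locations[r[2]][0])     # finest level
by_state = total_by(facts, lambda r: locations[r[2]][1])    # roll-up
by_country = total_by(facts, lambda r: locations[r[2]][2])  # further roll-up
by_product = total_by(facts, lambda r: r[0])

# Top five products ranked by total sales.
top_products = sorted(by_product.items(), key=lambda kv: kv[1], reverse=True)[:5]

# Pivot: products as rows, cities as columns.
pivot = defaultdict(dict)
for pid, _timeid, locid, value in facts:
    city = locations[locid][0]
    pivot[pid][city] = pivot[pid].get(city, 0) + value
```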
eXtensible Markup Language
Contains nodes that may be processing instructions, elements, attributes, CDATA sections or comments.
Must be well-formed
Format can be defined by a DTD or XSD.
Multiple formats in one document using namespaces.
Can be transformed using XSLT
[Example XML document with a <Library> root element]
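A small sketch of these points using Python's standard `xml.etree.ElementTree` module. The document below is hypothetical: its element names, the `id` attribute and both namespace URIs are invented for illustration. It mixes two vocabularies through namespaces, contains a comment and a CDATA section, and is checked for well-formedness by simply trying to parse it.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element names and namespace URIs are invented.
doc = """<?xml version="1.0"?>
<lib:Library xmlns:lib="http://example.org/library"
             xmlns:rev="http://example.org/review">
  <!-- a comment node -->
  <lib:Book id="b1">
    <lib:Title>An Example Title</lib:Title>
    <rev:Rating><![CDATA[4 < 5]]></rev:Rating>
  </lib:Book>
</lib:Library>"""


def is_well_formed(text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(doc)

# Prefix-to-URI map used only for the lookups below.
ns = {"lib": "http://example.org/library", "rev": "http://example.org/review"}
title = root.find("lib:Book/lib:Title", ns).text
rating = root.find("lib:Book/rev:Rating", ns).text  # CDATA arrives as plain text

# The Book element is never closed, so this is not well-formed.
broken = "<Library><Book></Library>"
```

Well-formedness is only the syntactic requirement; checking a document against a DTD or XSD is a separate validation step that `ElementTree` does not perform.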
Data Warehouses Reloaded
Data warehousing occurs across departments all over the globe, and also across companies
External data sources might include the WWW and other data warehouses
One flexible format for exchanging data cubes would be useful: XCube
XCube Scenarios
Download
XCube Scenarios (cont’d)
Query
XCube Scenarios (cont’d)
Generating
– Conversion of any data into data cube
– Using data from a warehouse in data cube
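To make the generating scenario concrete, here is a rough sketch that turns flat rows into an XML cube document. The element and attribute names (`cube`, `cell`, `dim`, `measure`) are placeholders chosen for this example and are not the XCube vocabulary; the function name `rows_to_cube_xml` and the sample rows are likewise assumptions.

```python
import xml.etree.ElementTree as ET


def rows_to_cube_xml(rows, dimensions, measure):
    """Serialise flat rows as one <cell> element per fact.

    The element and attribute names used here are placeholders for
    illustration only; they are not taken from the XCube formats.
    """
    cube = ET.Element("cube")
    for row in rows:
        cell = ET.SubElement(cube, "cell")
        for name in dimensions:
            # One <dim> child per dimension coordinate of this cell.
            ET.SubElement(cell, "dim", name=name, value=str(row[name]))
        ET.SubElement(cell, "measure", name=measure).text = str(row[measure])
    return ET.tostring(cube, encoding="unicode")


# Hypothetical source rows, e.g. the result of a warehouse query.
rows = [
    {"pid": 11, "timeid": 1, "locid": 1, "sales": 25},
    {"pid": 12, "timeid": 1, "locid": 2, "sales": 30},
]
xml_text = rows_to_cube_xml(rows, ["pid", "timeid", "locid"], "sales")
```

A real converter would emit the schema, dimension and fact data in their respective XCube formats rather than in one ad hoc document.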
Requirements for online cubes
Support for multidimensional data model.
Support for conceptual distinction between schema, dimension and fact data.
Transportable over the network.
Linking and inclusion concepts are needed for flexibility and reuse.
Extensible to adapt to different data models or new concepts.
Easily convertible to and from various sources and formats.
Possibly allow OLAP processing to reduce data transfer.
XCube formats
XCubeSchema
XCube formats (cont’d)
<multidimensionalSchema version="0.4" xmlns=" xmlns="
XCube formats (cont’d)
XCubeDimension
XCube formats (cont’d)
<dimensionData version="0.4" xmlns=" xmlns="
XCube formats (cont’d)
XCubeFact
XCube extended formats
XCubeText
– Adds a textual description to nearly every element.
– A future version will allow separate files.
– Allows different levels of detail (short, medium, long, html)
XCube extended formats (cont’d)
XCubeQuery
– Organises an interactive dialogue between client and server
– Meant to facilitate more efficient exchange of data
– Consists of seven different query formats
XCubeQuery
List of available cubes
– Request:
– Response:
XCubeQuery (cont’d)
Getting the schema of a specific cube
– Request:
– Response:
XCubeQuery (cont’d)
Querying the Classification Schema
– Request:
– Response:
XCubeQuery (cont’d)
Querying Classification Nodes
– Request:
– Response:
XCube extended formats (cont’d)
XCubeFunction
– Still under development
– Queries an XCube server about its functionality
XCube formats summary
XCubeSchema
XCubeDimension
XCubeFact
XCubeText
XCubeQuery
XCubeFunction
Related work
Common Warehouse Metamodel
MetaCube-X
XML for Analysis
Where from here
Basis for more complex and efficient infrastructure.
Combination with XML Web Services
Evolution of XCubeText
Create new data warehouses with XCube standards.
References
Wolfgang Hümmer, Andreas Bauer & Gunnar Hard; XCube – XML For Data Warehouses; DOLAP’03, November 7
Raghu Ramakrishnan & Johannes Gehrke; Database Management Systems, second edition; McGraw-Hill, 2000
T. Bray, J. Paoli, C.M. Sperberg-McQueen & E. Maler; Extensible Markup Language (XML) 1.0 (Second Edition); W3C Recommendation 6 October