Data Warehousing HOWTO ● What is a Data Warehouse? ● The organisational imperative? ● How to build a data warehouse? – Evan Leybourn – Director – Looking.


1 Data Warehousing HOWTO ● What is a Data Warehouse? ● The organisational imperative? ● How to build a data warehouse? – Evan Leybourn – Director – Looking Glass Solutions – Not a sales pitch! (Talk to me after for that)

2 Data Warehousing? ● Repository of Organisational Information ● Data from disparate sources is stored for – Reporting – Decision making – Business Intelligence

3 Warehouse Components ● Analysis and Reverse Engineering ● Design of the Consolidation DB and Data Marts ● Extraction and Transformation of the data ● Business level reporting on the data

4 Data ● Historical record of all transactions ● Not a transactional system. ● Turns data into information

5 What's Available ● Minimal footprint in the FOSS space. ● Commercial vendors: Business Objects and Oracle

6 Process 1: Analysis

7 Data vs Information ● Data: The raw content of the data warehouse ● Information: The (delicious) output as processed by an intelligence tool.

8 Organisation Requirements ● Meaningful reporting ● Data Integrity checking ● Data ownership

9

10 Data Analysis ● Understand your data sources ● Understand the database relationships ● Understand the content both public and private

11 Data Access ● Database schema and direct or ODBC connection. – Best option. ● Reverse engineering from an ODBC connection. – Acceptable but time consuming ● Reverse engineering from a database dump. – Sometimes your only choice, can be slow.
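
The reverse-engineering step on this slide can be sketched roughly as follows. This is only an illustration, not the speaker's tooling: it uses Python's built-in `sqlite3` module as a stand-in for an ODBC connection, and the `customer` and `invoice` tables are hypothetical. Against another database the catalogue queries would differ (for example `information_schema.columns`).

```python
import sqlite3


def describe_schema(conn):
    """Return {table: [(column, declared_type), ...]} for every user table."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' "
        "AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema


# Hypothetical source system, created in memory so the sketch is runnable.
source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
""")

schema = describe_schema(source)
for table, columns in schema.items():
    print(table, columns)
```

The resulting dictionary is the kind of raw material a data dictionary could be built from; relationships that were never declared as foreign keys still have to be worked out by inspection.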

12 Open Warehouse Project ● Currently titled 'Golf' ● Analysis function ● Automated reverse engineering tools ● Inbuilt data dictionary system

13 Process 2: Design

14 Databases ● A data warehouse is made up of numerous databases. ● Why PostgreSQL – Open Source – Scales to enormous data sets – Complies with SQL standard – Supports triggers and functions (perl/python). – Excellent indexing

15 Consolidation Database ● A database which contains all data from all sources. ● Should contain historical information. ● Denormalised Schema.

16 Schema

17 Golf ● Predefined plperl triggers. ● Check and insert incremental data. (Inserts and Updates) ● Fill timestamp fields ● Enforce pseudo-foreign keys
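
The three trigger behaviours listed on this slide can be illustrated with a small sketch. It is not the Golf code: the slide describes plperl triggers inside PostgreSQL, whereas this version does the same checks in application-side Python against an in-memory SQLite database. The tables and the helper names `upsert` and `load_invoice` are assumptions made for the example.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (source_id INTEGER PRIMARY KEY, name TEXT,
                           created_at TEXT, updated_at TEXT);
    -- No FOREIGN KEY clause: the link to customer is only checked in code.
    CREATE TABLE invoice (source_id INTEGER PRIMARY KEY, customer_id INTEGER,
                          total REAL, created_at TEXT, updated_at TEXT);
""")


def upsert(conn, table, key, fields):
    """Insert a new row or update an existing one, filling timestamp fields.

    Table and column names are trusted here; only values are parameterised.
    """
    stamp = datetime.now(timezone.utc).isoformat()
    exists = conn.execute(
        f"SELECT 1 FROM {table} WHERE source_id = ?", (key,)
    ).fetchone()
    if exists:
        assignments = ", ".join(f"{name} = ?" for name in fields)
        conn.execute(
            f"UPDATE {table} SET {assignments}, updated_at = ? WHERE source_id = ?",
            (*fields.values(), stamp, key),
        )
        return "updated"
    names = ", ".join(fields)
    marks = ", ".join("?" for _ in fields)
    conn.execute(
        f"INSERT INTO {table} (source_id, {names}, created_at, updated_at) "
        f"VALUES (?, {marks}, ?, ?)",
        (key, *fields.values(), stamp, stamp),
    )
    return "inserted"


def load_invoice(conn, key, customer_id, total):
    """Enforce the pseudo-foreign key before loading an invoice row."""
    parent = conn.execute(
        "SELECT 1 FROM customer WHERE source_id = ?", (customer_id,)
    ).fetchone()
    if parent is None:
        raise ValueError(f"invoice {key} references unknown customer {customer_id}")
    return upsert(conn, "invoice", key, {"customer_id": customer_id, "total": total})


results = [
    upsert(conn, "customer", 1, {"name": "Acme"}),
    load_invoice(conn, 10, 1, 99.5),   # first sighting: insert
    load_invoice(conn, 10, 1, 120.0),  # same key again: update
]
print(results)
```

Doing this in a database trigger, as the slide describes, has the advantage that every loader gets the same checks regardless of the language it is written in.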

18 Data Marts ● Subsets of the consolidation database ● Used for reporting and business intelligence ● Orders of magnitude faster to query ● Same requirements as the consolidation database.
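
As a rough illustration of a data mart built from a consolidation database, the sketch below derives a small reporting table from a larger one. The table names, columns and sample rows are invented for the example, SQLite stands in for PostgreSQL, and pre-aggregating by month is only one possible way to take a "subset".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical consolidation table holding every sale from every source.
    CREATE TABLE consolidated_sales (
        sale_id INTEGER PRIMARY KEY, region TEXT, product TEXT,
        sale_date TEXT, amount REAL);
    INSERT INTO consolidated_sales (region, product, sale_date, amount) VALUES
        ('North', 'Widget', '2007-01-15', 100.0),
        ('North', 'Widget', '2007-01-20', 50.0),
        ('South', 'Gadget', '2007-01-03', 75.0),
        ('North', 'Gadget', '2007-02-11', 20.0);

    -- The mart keeps only what a monthly regional report needs.
    CREATE TABLE mart_monthly_sales AS
        SELECT region,
               substr(sale_date, 1, 7) AS month,
               SUM(amount) AS total_amount,
               COUNT(*) AS sale_count
        FROM consolidated_sales
        GROUP BY region, substr(sale_date, 1, 7);
    CREATE INDEX idx_mart_region_month ON mart_monthly_sales (region, month);
""")

mart_rows = conn.execute(
    "SELECT region, month, total_amount, sale_count "
    "FROM mart_monthly_sales ORDER BY region, month"
).fetchall()
for row in mart_rows:
    print(row)
```

Reports then query the small, indexed mart instead of scanning the full history, which is where the speed-up mentioned on the slide would come from.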

19 Process 3: Extraction

20 ● Extract -> Transform -> Insert
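
A minimal sketch of the Extract -> Transform -> Insert flow, assuming a CSV export as the source and an in-memory SQLite table as the target. The `RAW` data, the `payment` table and the three function names are made up for the illustration; a real pipeline would read from the source systems and write to the consolidation database.

```python
import csv
import io
import sqlite3

# Hypothetical export from a source system.
RAW = "id,name,amount\n1, alice ,10.50\n2,BOB,not-a-number\n3,Carol,7\n"


def extract(text):
    """Yield one dict per source record."""
    yield from csv.DictReader(io.StringIO(text))


def transform(rows):
    """Clean each record; skip records whose amount cannot be parsed."""
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # unparseable amount: the record is not loaded
        yield (int(row["id"]), row["name"].strip().title(), amount)


def insert(conn, records):
    """Load the cleaned records into the target table."""
    conn.executemany(
        "INSERT INTO payment (id, name, amount) VALUES (?, ?, ?)", records
    )
    conn.commit()


warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE payment (id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
)
insert(warehouse, transform(extract(RAW)))

loaded = warehouse.execute(
    "SELECT id, name, amount FROM payment ORDER BY id"
).fetchall()
print(loaded)
```

Because each stage is a generator feeding the next, records stream through one at a time rather than being held in memory all at once.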

21 Transformation Reasons ● Validity checks ● Standardisation ● Integration

22 Transformation Types ● Join fields ● Split fields (by regex) ● Modify content (by regex) ● Drop field ● Insert arbitrary field ● Drop row
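
Each transformation type on this slide can be written as a small function over a record. The sketch below treats a record as a Python dict; the function names and the sample record are assumptions for illustration and do not come from the talk.

```python
import re


def join_fields(row, sources, target, sep=" "):
    """Join several fields into one new field, removing the originals."""
    out = {k: v for k, v in row.items() if k not in sources}
    out[target] = sep.join(row[s] for s in sources)
    return out


def split_field(row, source, pattern, targets):
    """Split one field by regex into the named target fields."""
    parts = re.split(pattern, row[source], maxsplit=len(targets) - 1)
    out = {k: v for k, v in row.items() if k != source}
    out.update(zip(targets, parts))
    return out


def modify_field(row, field, pattern, replacement):
    """Rewrite a field's content with a regex substitution."""
    return {**row, field: re.sub(pattern, replacement, row[field])}


def drop_field(row, field):
    """Remove a field from the record."""
    return {k: v for k, v in row.items() if k != field}


def insert_field(row, field, value):
    """Add a field with an arbitrary constant value."""
    return {**row, field: value}


def keep_row(row, field, pattern):
    """Return False for rows that should be dropped."""
    return re.search(pattern, row[field]) is not None


# Hypothetical source record.
row = {"first": "Ada", "last": "Lovelace", "phone": "(02) 5550 1234",
       "location": "Canberra, ACT", "notes": "n/a"}
row = join_fields(row, ["first", "last"], "full_name")
row = split_field(row, "location", r",\s*", ["city", "state"])
row = modify_field(row, "phone", r"\D", "")
row = drop_field(row, "notes")
row = insert_field(row, "source_system", "crm")

rows = [r for r in [row] if keep_row(r, "state", r"^ACT$")]
print(rows)
```

Keeping each transformation as a pure function that returns a new dict makes it straightforward to chain them in whatever order a given source requires.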

23 Process 4: Reporting

24 ● Tabular reporting ● Web Services ● Dashboarding

25 Thank You ● Questions?

