Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Data Warehouses BUAD/American University Data Warehouses.

Similar presentations


Presentation on theme: "1 Data Warehouses BUAD/American University Data Warehouses."— Presentation transcript:

1 1 Data Warehouses BUAD/American University Data Warehouses

2 2 BUAD/American University Definition Data Warehouse: An integrated and consistent store of subject-oriented data that is obtained from a variety of sources and formatted into a meaningful context to support decision-making in an organization.

3 3 Data Warehouses BUAD/American University Need for Data Warehousing Integrated, company-wide view of high- quality information. Separation of operational and informational systems and data –operational system: a system that is used to run a business in real time, based on current data –informational system: systems designed to support decision making based on stable point- in-time or historical data

4 4 Data Warehouses BUAD/American University Factors Allowing Data Warehousing Relational DBMS. Advances in hardware: speed and storage capacity. End-user computing interfaces and tools.

5 5 Data Warehouses BUAD/American University Data Warehouse Architectures Two-level –source system files containing operational data –transformed and integrated data warehouse Three-level –Operational data. –Enterprise data warehouse (EDW)- single source of data for decision making. –Data marts - limited scope; data selected from EDW; customized decision-support for individual user groups

6 6 Data Warehouses BUAD/American University Generic data warehouse architecture

7 7 Data Warehouses BUAD/American University Three-layer architecture

8 8 Data Warehouses BUAD/American University Reasons for the Three-Level Architecture EDW and data marts have different purposes and data architectures. Data transformation is complex and is best performed in two steps. Data marts customized decision support for different groups Architecture –Operational data, reconciled data, Derived data.

9 9 Data Warehouses BUAD/American University Three-layer data architecture

10 10 Data Warehouses BUAD/American University Data Characteristics Status vs. Event data. –A transaction is a business activity that triggers one or more business events: event data captures them Transient vs. Periodic data. –Transient: data in which changes to existing records are written over previous records, thus destroying previous data content –periodic data: data that are never physically altered or deleted once added

11 11 Data Warehouses BUAD/American University Example of DBMS log entry

12 12 Data Warehouses BUAD/American University Transient operational data

13 13 Data Warehouses BUAD/American University Reconciled Data Characteristics Detailed Historical Normalized Enterprise-wide Quality controlled

14 14 Data Warehouses BUAD/American University The Data Reconciliation Process Capture: capture the relevant data from source files to fill EDW –Static - initial load. –Incremental - ongoing update. Scrub or data cleansing –missing data, name reconciliation –Pattern recognition and other artificial intelligence techniques.

15 15 Data Warehouses BUAD/American University Steps in data reconciliation

16 16 Data Warehouses BUAD/American University The Data Reconciliation Process Transform –Convert the data format from the source to the target system. –Record-Level Functions Selection. Joining. Aggregation (for data marts). –Field-Level Functions Single-field transformation Multi-field transformation

17 17 Data Warehouses BUAD/American University The Data Reconciliation Process Load and Index –Refresh Mode When the warehouse is first created. Static data capture. –Update Mode Ongoing update of the warehouse. Incremental data capture.

18 18 Data Warehouses BUAD/American University Derived Data Characteristics Type of data –Detailed, possibly periodic. –Aggregated. Distributed to departmental servers. Implemented in star schema.

19 19 Data Warehouses BUAD/American University Star Schema Also called the dimensional model. Fact and dimension tables. –Fact table: consists of factual or quantitative data about the business –Dimension table: hold descriptive data Grain of a fact table - time period for each record.

20 20 Data Warehouses BUAD/American University Components of a star schema

21 21 Data Warehouses BUAD/American University Star schema example

22 22 Data Warehouses BUAD/American University Star schema with sample data

23 23 Data Warehouses BUAD/American University Example of snowflake sample

24 24 Data Warehouses BUAD/American University Size of the fact table Total number of stores: 1,000 Total number of products: 10,000 Total number of periods: 24 Total rows: 1000 * 10,000 * 24 = 240,000,000 On average 50% items record sales, –no of rows = 120,000,000

25 25 Data Warehouses BUAD/American University Types of Data Marts Dependent - Populated from the EDW. Independent - Data taken directly from the operational databases.

26 26 Data Warehouses BUAD/American University The User Interface The role of metadata. Traditional query and reporting tools. On-line analytical processing (OLAP) The use of a set of graphical tools that provides users with multidimensional views of their data and allows them to analyze the data using simple windowing techniques.

27 27 Data Warehouses BUAD/American University The User Interface –Slicing a cube. –Pivot Rotate the view for a particular data point to obtain another perspective. E.g. take a value from the units column and obtain by-store values. –Drill-down

28 28 Data Warehouses BUAD/American University Slicing a data cube


Download ppt "1 Data Warehouses BUAD/American University Data Warehouses."

Similar presentations


Ads by Google