MIS5101: Extract, Transform, Load (ETL)

Discuss (5 minutes): Based on the readings… Why are we drowning in data? Why is the process of ETL necessary? What is the “single version of the truth”?

Why are we “drowning in data”? According to the article? Technological changes?

Evaluating the tradeoff: value(D_access) vs. value(D_accuracy). How much does it cost? How much do you save? How much do your outcomes improve? How much is an incremental improvement worth? …and the relationships are probably non-linear.

Extract, Transform, Load (ETL): Copying data from the transactional database to a format where it can be analyzed. Selecting and resolving inconsistencies in data to fill the analytical data store.

ETL Defined in a “relational” world: Extract data from various databases across the organization. Transform it into a consistent, analysis-ready format. Load it into an “analytical” data store, where large-scale analysis is performed.

ETL Defined in a “relational” world [Diagram. Stages: Extract, Transform, Load. Elements: Real-time Database 1 and Real-time Database 2 (each with Query and Data conversion steps), Data Warehouse (Analytical Data Store), Query, On-Demand Reporting.]
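The Extract, Transform, Load flow defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's implementation: the two in-memory SQLite databases stand in for the transactional sources, and every table name, column name, and formatting quirk (cents versus dollars, MM/DD/YYYY versus ISO dates) is hypothetical.

```python
import sqlite3


def extract(conn, query):
    """Extract: run a query against one transactional source."""
    return conn.execute(query).fetchall()


def transform_db1(rows):
    """Transform source 1 (assumed: amounts in cents, dates as MM/DD/YYYY)."""
    out = []
    for cust, amount_cents, date in rows:
        month, day, year = date.split("/")
        out.append((cust.strip().upper(), amount_cents / 100.0,
                    f"{year}-{month}-{day}"))
    return out


def transform_db2(rows):
    """Transform source 2 (assumed: amounts as dollar strings, ISO dates)."""
    return [(cust.strip().upper(), float(amount), date)
            for cust, amount, date in rows]


def load(warehouse, rows):
    """Load: insert the converted rows into the analytical data store."""
    warehouse.executemany(
        "INSERT INTO sales_fact (customer, amount_usd, sale_date) "
        "VALUES (?, ?, ?)", rows)
    warehouse.commit()


def run_etl():
    # Hypothetical transactional database 1
    db1 = sqlite3.connect(":memory:")
    db1.execute("CREATE TABLE orders "
                "(cust TEXT, amount_cents INTEGER, order_date TEXT)")
    db1.execute("INSERT INTO orders VALUES (' acme ', 1250, '03/27/2014')")

    # Hypothetical transactional database 2
    db2 = sqlite3.connect(":memory:")
    db2.execute("CREATE TABLE sales "
                "(customer TEXT, amount TEXT, sold_on TEXT)")
    db2.execute("INSERT INTO sales VALUES ('Globex', '99.50', '2014-02-20')")

    # Hypothetical analytical data store
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute("CREATE TABLE sales_fact "
                      "(customer TEXT, amount_usd REAL, sale_date TEXT)")

    load(warehouse, transform_db1(
        extract(db1, "SELECT cust, amount_cents, order_date FROM orders")))
    load(warehouse, transform_db2(
        extract(db2, "SELECT customer, amount, sold_on FROM sales")))

    return warehouse.execute(
        "SELECT customer, amount_usd, sale_date "
        "FROM sales_fact ORDER BY customer").fetchall()


if __name__ == "__main__":
    for row in run_etl():
        print(row)
```

Each source gets its own transform function because each source has its own format; the load step only ever sees rows in the one agreed warehouse format.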

Main ETL Issues (Conversion Stage): Data Consistency: What if the data is in different formats? Data Quality: How do we know it’s correct? What if there is missing data? What if the data we need isn’t there?
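The consistency and quality questions above can be turned into explicit checks during the conversion stage. The sketch below is one possible way to do that, assuming a record is a plain dictionary; the field names, the list of accepted date formats, and the rule that negative amounts are suspect are all hypothetical choices made for illustration.

```python
from datetime import datetime

# Hypothetical schema: fields every record is expected to carry
REQUIRED_FIELDS = ("customer", "amount", "sale_date")
# Hypothetical date formats seen across the source systems
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(text):
    """Return the date as ISO text, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def check_record(record):
    """Return (cleaned_record, problems) for one source record."""
    problems = []
    cleaned = dict(record)

    # Data quality: is anything we need missing?
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")

    # Data consistency: convert differing date formats to one format
    if record.get("sale_date"):
        iso = parse_date(record["sale_date"])
        if iso is None:
            problems.append("unrecognized date format")
        else:
            cleaned["sale_date"] = iso

    # Data quality: is the amount a plausible number?
    amount = record.get("amount")
    if amount not in (None, ""):
        try:
            value = float(amount)
        except (TypeError, ValueError):
            problems.append("amount is not a number")
        else:
            if value < 0:
                problems.append("negative amount")
            cleaned["amount"] = value

    return cleaned, problems


if __name__ == "__main__":
    print(check_record(
        {"customer": "Acme", "amount": "12.50", "sale_date": "03/27/2014"}))
    print(check_record(
        {"customer": "", "amount": "abc", "sale_date": "27 March"}))
```

Returning the list of problems rather than silently dropping bad records leaves the decision about what to do with them (reject, repair, or load with a flag) to whoever owns the warehouse.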

Give examples of data inconsistencies: in retail, in healthcare, in finance. How do you resolve them?

Conflicts abound… Why might there be resistance to this type of aggregation? Is it an option to just “fix” the transactional (source) databases? If two data elements conflict, whose standard “wins”?
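One common way to settle the question of whose standard “wins” is a fixed precedence order among source systems. The sketch below shows that single policy only; it is not presented as the answer the slides intend, and the source names and their ranking are hypothetical.

```python
# Hypothetical ranking: sources listed earlier are trusted more
SOURCE_PRIORITY = ["billing", "crm"]


def resolve(field_values):
    """Pick one value for a field from conflicting sources.

    field_values maps a source name to that source's value. The value from
    the highest-priority source that actually has one is returned, along
    with the name of the source it came from.
    """
    for source in SOURCE_PRIORITY:
        value = field_values.get(source)
        if value not in (None, ""):
            return value, source
    return None, None


if __name__ == "__main__":
    print(resolve({"crm": "12 Main St", "billing": "12 Main Street"}))
```

Recording which source supplied the winning value keeps the decision auditable, which matters when the owners of the losing system object to the result.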