City of Charlotte Data Warehousing and Business Intelligence and Building Mashups By Example by Rattapoom Tuchinda, Pedro Szekely, and Craig A. Knoblock.

Presentation transcript:

City of Charlotte Data Warehousing and Business Intelligence and Building Mashups By Example by Rattapoom Tuchinda, Pedro Szekely, and Craig A. Knoblock by Doris Phillips 6010 Data Integration UNC Charlotte

Business Decision Making Which? (Raper, J., Building a Data Warehouse in a Heterogeneous Tool Environment)

Overview
Describe current data warehousing/business intelligence methods at the City of Charlotte
Present Building Mashups By Example by Rattapoom Tuchinda, Pedro Szekely, and Craig A. Knoblock
Compare data warehousing techniques to data integration methods for mashups

Focus on Processes Common to Data Warehousing and Data Integration
Data Retrieval
Data Modeling
Data Cleaning
Data Integration
Common processes but not the same!

Data Warehousing at the City of Charlotte

DISCLAIMER
The views and opinions presented in this paper are solely those of the Author and do not necessarily reflect those of Business Support Services Information Technology Division or of the City of Charlotte. This material is provided for informational purposes only. The City of Charlotte assumes no responsibility for the accuracy of the information in this paper or for damages caused by implementing the techniques or methodologies presented herein.

City of Charlotte Simplified KBU Organization Chart (Raper, J., Building a Data Warehouse in a Heterogeneous Tool Environment)

Sample Data Sources Accounts Payable Accounts Receivable Asset Center Emerald Faster General Ledger Kronos Hansen PeopleSoft Remedy Unisys Helpdesk Utility Billing System

HUB and Spoke Design
[Diagram labels: SOURCES (DB2 DB, MS/SS DB, ORA DB, FILES); DATA VAULT (DV ODS, Oracle); TARGET DATA MARTS (UTL ORA MART, SWS MS/SS MART, FIN MS/SS MART); ETL tools: OWB, DTS]
(Raper, J., Building a Data Warehouse in a Heterogeneous Tool Environment)

Data Vault Components
Hubs – Key Topics
Links – Relationships
Satellites – Details
Auxiliaries – Support
Implemented as Relational Tables
Diagrammed Using Distinctive Shapes
(Raper, J., Phillips, D., User-Managed Metadata: Oracle Application Express Meets Oracle Warehouse Builder 10.2)

Hub Tables
Hub records contain the logical business and physical keys to the business data and its context.
Relatively Stable over Time
Keys
– Primary Key: Surrogate ID
– Natural Key: Unique Logical Key Fields
(Raper, J., Phillips, D., User-Managed Metadata: Oracle Application Express Meets Oracle Warehouse Builder 10.2)

Link Tables
Link records provide information related to relationships between Hubs and Links.
The information in link records can and does change over time.
May have one or more Satellite Type records
Keys
– Primary Key: Surrogate ID
– Natural Key: Unique Logical Key Fields
– Foreign Key: HUB or LNK SIDs
(Raper, J., Phillips, D., User-Managed Metadata: Oracle Application Express Meets Oracle Warehouse Builder 10.2)

Satellite Tables
Satellite records provide the structure to hold the context or descriptive information from operational systems.
Satellites maintain changes to this information over time.
Related Directly to Hub or Link Records
Keys
– Primary Key: Surrogate ID
– Foreign Key: HUB or LNK SIDs
(Raper, J., Phillips, D., User-Managed Metadata: Oracle Application Express Meets Oracle Warehouse Builder 10.2)

Auxiliary Tables
Auxiliary records contain a variety of cross reference and lookup descriptions tied to logical business keys.
Standalone Support Tables
Not Directly Linked to HUB, SAT, or LNK tables.
Types of Auxiliaries
– Lookups
– Cross References
– External Tables
– ETL Work Tables
Keys
– Primary Key is NK: Unique Logical Key Fields
– May have FK
(Raper, J., Phillips, D., User-Managed Metadata: Oracle Application Express Meets Oracle Warehouse Builder 10.2)

Generic Data Vault Schema (Raper, J., Phillips, D., User-Managed Metadata: Oracle Application Express Meets Oracle Warehouse Builder 10.2)
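
As a rough illustration of the four component types described above, the following sketch creates one hub, link, satellite, and auxiliary table in an in-memory SQLite database. The table and column names (hub_employee, sat_employee, and so on) are hypothetical and are not taken from the City of Charlotte schema; SQLite stands in for Oracle only to keep the example self-contained.

```python
import sqlite3

# Hypothetical names throughout; the slides do not give the actual DDL.
DDL = """
CREATE TABLE hub_employee (
    hub_sid     INTEGER PRIMARY KEY,   -- primary key: surrogate ID
    employee_no TEXT NOT NULL UNIQUE   -- natural key: unique logical key field
);
CREATE TABLE hub_department (
    hub_sid   INTEGER PRIMARY KEY,
    dept_code TEXT NOT NULL UNIQUE
);
CREATE TABLE lnk_employee_department (
    lnk_sid        INTEGER PRIMARY KEY,
    employee_sid   INTEGER NOT NULL REFERENCES hub_employee(hub_sid),
    department_sid INTEGER NOT NULL REFERENCES hub_department(hub_sid),
    UNIQUE (employee_sid, department_sid)  -- natural key of the relationship
);
CREATE TABLE sat_employee (
    sat_sid   INTEGER PRIMARY KEY,
    hub_sid   INTEGER NOT NULL REFERENCES hub_employee(hub_sid),
    load_date TEXT NOT NULL,           -- one row per captured change
    full_name TEXT
);
CREATE TABLE aux_dept_lookup (
    dept_code   TEXT PRIMARY KEY,      -- primary key is the natural key
    description TEXT
);
"""


def build_vault():
    """Create the illustrative Data Vault tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn


def satellite_history(conn, employee_no):
    """Return every satellite row recorded for one hub key, oldest first."""
    return conn.execute(
        """SELECT s.load_date, s.full_name
           FROM sat_employee s
           JOIN hub_employee h ON s.hub_sid = h.hub_sid
           WHERE h.employee_no = ?
           ORDER BY s.load_date""",
        (employee_no,),
    ).fetchall()
```

The hub row stays fixed while descriptive changes accumulate as additional satellite rows, which is one way to read "relatively stable over time" for hubs and "maintain these changes over time" for satellites.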

Oracle Warehouse Builder Mapping New Code Capture (Raper, J., Phillips, D., User-Managed Metadata…)

Oracle Warehouse Builder Mapping New Code Capture Detail (Raper, J., Phillips, D., User-Managed Metadata…)

Data Cleaning
Convert to UPPER case
Trim blank spaces
Replace NULL with UNK or UNKNOWN
Check for valid values using lookups
Remove duplicates
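
The cleaning steps above can be sketched in a few lines of Python. This is only an illustration, not the Oracle Warehouse Builder mapping itself: the function names are hypothetical, and treating empty strings and values that fail the lookup as UNK is an assumption, since the slide does not say what happens to them.

```python
def clean_value(value, valid_values=None, unknown="UNK"):
    """Apply the per-value cleaning steps to one source field."""
    if value is None:                      # replace NULL with UNK
        return unknown
    cleaned = value.strip().upper()        # trim blanks, convert to upper case
    if cleaned == "":                      # assumption: empty counts as unknown
        return unknown
    if valid_values is not None and cleaned not in valid_values:
        return unknown                     # assumption: failed lookup becomes UNK
    return cleaned


def clean_column(values, valid_values=None):
    """Clean every value, then remove duplicates while keeping first-seen order."""
    seen = set()
    result = []
    for value in values:
        cleaned = clean_value(value, valid_values)
        if cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result
```

For example, clean_column([" fin ", "FIN", None, "utl"], {"FIN", "UTL"}) collapses the two spellings of FIN into one entry and maps the NULL to UNK.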

Metadata Capture - Cross Reference Organization Codes APEX Application (Raper, J., Phillips, D., User-Managed Metadata…)

Data Warehouse Approach
Extracted data from multiple heterogeneous sources
Converted to Data Vault Architecture
Cleaned data and transformed into desired format
Combined data from multiple sources
Data provided to users for reporting and data visualizations

Building Mashups By Example by Rattapoom Tuchinda, Pedro Szekely, and Craig A. Knoblock

Overview
Mashup: A web application that integrates data from multiple web sources to provide a unique service
Goal: Create a mashup building framework where an average Internet user with no programming experience can build Mashups easily

Current Solutions
Widget Paradigm
– Current solutions involve selecting, customizing, and connecting widgets together
Disadvantages
– As the number of widgets gets large, locating the right widget becomes confusing and time consuming
– Connecting widgets requires understanding programming concepts

Widgets – Yahoo Pipes (Tuchinda et al., p. 140)

Microsoft Popfly
United States Information Widget - EDIT
United States Information Widget - RUN

Mashup Building Process
Data Retrieval – Extracting data from web pages into a structured source (table or XML)
Source Modeling – Process of assigning the attribute name for each data column
Data Cleaning – Required to fix misspellings and transform extracted data into the appropriate format
Data Integration – Specifies how to combine two or more data sources together

Karma Solution
“The left window is an embedded web browser. The top right window contains a table that a user would interact with. The lower right window shows options that the user can select to get into different modes of operation.” (Tuchinda et al., p. 140)

Data Retrieval – Karma Table
User Selects Value from Page, List Automatically Populated
User Selects Address for Value, List Automatically Populated
(Tuchinda, et. al, p )

Source Modeling - Attributes
Karma Attributes
Karma compares extracted data with existing data in its repository
Automatically populates some attributes
User specifies the correct attribute
Users search existing attributes in data repository
Example
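
One simple way to picture "compares extracted data with existing data in its repository" is a value-overlap score. The sketch below is an assumption made for illustration and is not Karma's actual algorithm; the function name, the repository layout (attribute name mapped to a set of known values), and the scoring rule are all hypothetical.

```python
def suggest_attribute(column_values, repository):
    """Suggest the repository attribute whose known values best cover a column.

    repository maps an attribute name to the set of values already stored
    under that attribute. Returns (attribute_name, score), or (None, 0.0)
    when nothing overlaps, in which case the user would pick the attribute.
    """
    column = {v.strip().lower() for v in column_values}
    best_name, best_score = None, 0.0
    for name, known in repository.items():
        known_norm = {k.strip().lower() for k in known}
        # Fraction of the extracted values already seen under this attribute.
        score = len(column & known_norm) / len(column) if column else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score
```

A column with no overlap returns no suggestion, which corresponds to the case where the user specifies the attribute or searches the repository manually.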

Data Cleaning
Users specify what data to clean
Karma tries to pick up desired transformations and populate the remaining columns
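
Cleaning by example can be illustrated with a toy version: the user supplies one or more (raw, desired) pairs, and the program looks for a transformation consistent with all of them and applies it to the rest of the column. The candidate list and function names below are hypothetical, and the search is far simpler than whatever Karma does internally.

```python
# Hypothetical candidate transformations; a real system would have many more.
CANDIDATES = {
    "strip": str.strip,
    "upper": str.upper,
    "lower": str.lower,
    "title": str.title,
    "digits_only": lambda s: "".join(ch for ch in s if ch.isdigit()),
}


def infer_transformation(examples):
    """Return the name of the first candidate consistent with every example."""
    for name, fn in CANDIDATES.items():
        if all(fn(raw) == wanted for raw, wanted in examples):
            return name
    return None


def apply_by_example(column, examples):
    """Apply the inferred transformation to the column, or leave it unchanged."""
    name = infer_transformation(examples)
    if name is None:
        return list(column)
    return [CANDIDATES[name](value) for value in column]
```

Given the single example ("(310) 555-0100", "3105550100"), only digits_only reproduces the desired output, so the remaining phone numbers in the column are reduced to digits as well.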

Data Integration
Karma
Analyzes attributes and data to determine possible join conditions
Suggests existing data sources in the repository that can be linked to the new data in the table
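
A minimal sketch of how possible join conditions might be found: look for repository sources that share an attribute name with the new table and whose values overlap with it. This is an assumed heuristic for illustration only, not the method described in the paper; the function name, data layout, and min_overlap threshold are hypothetical.

```python
def suggest_joins(new_table, repository_sources, min_overlap=0.5):
    """Rank repository sources that could be joined to a new table.

    new_table maps attribute name -> list of values.
    repository_sources maps source name -> {attribute name -> list of values}.
    Returns (source_name, attribute, overlap) tuples, best overlap first.
    """
    suggestions = []
    for source_name, source in repository_sources.items():
        for attr, values in new_table.items():
            if attr not in source:
                continue                      # no shared attribute, no join
            new_vals = set(values)
            if not new_vals:
                continue
            # Fraction of the new table's values found in the existing source.
            overlap = len(new_vals & set(source[attr])) / len(new_vals)
            if overlap >= min_overlap:
                suggestions.append((source_name, attr, overlap))
    suggestions.sort(key=lambda s: s[2], reverse=True)
    return suggestions
```

Each returned tuple names a candidate source and the attribute to join on, which a user could then accept or reject.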

Data Integration
Karma Problems
Locating the related sources from the repository
Figuring out the query to combine the new source and existing valid sources

Data Integration
Karma Solution
Uses table constraints
Uses programming methods and procedures induced from user interaction

Mashup Building Approach
Combines most problem areas in Mashup building into a unified interactive framework that requires no widgets
Allows users with no programming background to easily create Mashups by example

Conclusions
Data Warehouse
– Data from enterprise applications and maintained sources
– Historical data for trending and analysis
– Extract, Transform, and Load
– Often strategic information
Data Integration / Mashup
– Data from web that may not be maintained or may contain errors
– Real time data for current information
– Extract, Model, and Clean
– Often tactical information

Resources
Linstedt, D. Nov. 9,
Linstedt, D., Graziano, K., Hultgren, H., May The New Business Supermodel: The Business of Data Vault Modeling.
Raper, J., Dec Building a Data Warehouse in a Heterogeneous Tool Environment.
Raper, J., Dec From Source to Loading Dock with Oracle Warehouse Builder.
Raper, J., Phillips, D., User-Managed Metadata: Oracle Application Express Meets Oracle Warehouse Builder 10.2.
Tuchinda, R., Szekely, P., Knoblock, C., ACM, Building Mashups by Example.