Data Transformation for Analysis Purposes Presented By: Gregg Ravenscroft Khulisa Management Services

Scenario
Organisation faced with disparate data sets that are:
- From decommissioned systems
- From multiple systems
- In a format not conducive to analysis
But…
- Information in the data sets is needed for analysis
- The data structure does not allow analysis
- Information is available but inaccessible

Data Transformation Goals
Overcome the challenge of variable underlying data set structures through:
- Creating a uniform, integrated data set that allows for timely and easily accessible reports
- Integrating data needs according to a central schema

Typical Ways Data Sets Vary
- Non-standardised table and field names where the information or content is similar
- Differences in the way data is stored within data sets (some fields entered as text, others as numeric values or designated codes)
- Similar naming conventions in data sets for different information
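The first two kinds of variation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the field names, the alias table and the gender codes are hypothetical and do not come from the presentation.

```python
# Hypothetical alias table: every source-specific field name maps to one
# canonical name in the central schema. All names here are illustrative.
FIELD_ALIASES = {
    "site": "facility_name",
    "Facility": "facility_name",
    "clinic_name": "facility_name",
    "sex": "gender",
    "Gender": "gender",
}

# Hypothetical code table: one source stores gender as text, another as a code.
GENDER_CODES = {
    "1": "male", "2": "female",
    "m": "male", "f": "female",
    "male": "male", "female": "female",
}


def harmonise(record):
    """Return a copy of record using canonical field names and values."""
    out = {}
    for name, value in record.items():
        canonical = FIELD_ALIASES.get(name, name)
        out[canonical] = value
    if "gender" in out:
        key = str(out["gender"]).strip().lower()
        # Unknown values become None so they stay visible instead of being guessed.
        out["gender"] = GENDER_CODES.get(key)
    return out


source_a = {"site": "Clinic 12", "sex": 2}
source_b = {"Facility": "Clinic 12", "Gender": "Female"}
print(harmonise(source_a))
print(harmonise(source_b))
```

Both records come out with the same field names and the same value representation. The third kind of variation (the same name used for different information) would likely need a separate alias table per source rather than one shared table.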

Step One Solution
- Evaluate the different data sets
- Focus on a database structure that responds to the organisation’s reporting requirements
- Collaborate on an ideal data structure designed for ease of analysis
- Involve stakeholders and ensure buy-in
- Design new business processes

Step One Solution (cont)
- Acquire the data
- Maintain the integrity of the data sets
- Ensure the transfer process maintains the reliability and validity of the data
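The slides do not say how integrity is checked during acquisition. One common option, shown here purely as an assumption, is to compare a checksum of the data before and after transfer. The sample bytes are hypothetical.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def transfer_is_intact(original: bytes, received: bytes) -> bool:
    """True when the received copy is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(received)


# Hypothetical file contents before and after a transfer.
original = b"id,facility,visits\n1,Clinic 12,40\n"
good_copy = bytes(original)
bad_copy = original.replace(b"40", b"4")  # simulated corruption

print(transfer_is_intact(original, good_copy))
print(transfer_is_intact(original, bad_copy))
```

A checksum only shows that the copy matches what was sent; it says nothing about whether the source data was valid in the first place.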

Step Two Data Extraction
Uncomplicated extraction:
- Importing a spreadsheet from an MS Excel file
- Converting Word documents to Excel and then exporting the spreadsheet
- Importing a CSV file
Complicated extraction:
- Setting up relationships to external data systems such as Oracle, MS SQL and PostgreSQL
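As a small illustration of the uncomplicated case, the sketch below parses CSV text with Python's standard csv module. The sample data and the function name extract_rows are hypothetical. Excel files and connections to external database systems would need additional libraries and are not covered here.

```python
import csv
import io

# Stand-in for the contents of an exported file. With a real file, one would
# pass open(path, newline="", encoding="utf-8") to csv.DictReader instead.
RAW_CSV = "id,facility,visits\n1,Clinic 12,40\n2,Clinic 7,\n"


def extract_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


rows = extract_rows(RAW_CSV)
print(len(rows), "rows extracted")
print(rows[1])
```

Every value arrives as text, and a missing value arrives as an empty string, so type conversion and handling of blanks are left to the transformation step.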

Step Three Transformation
- Utilise a third-party system
- Follow the schematic outline agreed with stakeholders
- Investigate the process of converting data formats through use of a data dictionary
- Use a dimension and mapping system

Step Three Transformation (cont)
- Map information in data sets
- Take account of inherent dimensions
- Specify how the data will fit into the refined output data set/s
- Design internal checks to:
  - Minimise mapping errors
  - Reject incorrect mappings
- The transformation system does not store data; it “translates” data from source to destination
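The presentation relies on a third-party system for this step, so the following is only a rough sketch of the idea, not that system. The mapping specification, destination schema and function names are all hypothetical. It shows a mapping that is checked before use and a translator that passes records through without storing them.

```python
# Hypothetical mapping specification:
# source field -> (destination field, converter function).
MAPPING = {
    "facility": ("facility_name", str.strip),
    "visits": ("visit_count", int),
}

# Hypothetical destination schema agreed with stakeholders.
DESTINATION_FIELDS = {"facility_name", "visit_count"}


def validate_mapping(mapping, destination_fields):
    """Reject a mapping with unknown, unfilled or doubly-filled destination fields."""
    targets = [dest for dest, _ in mapping.values()]
    unknown = set(targets) - destination_fields
    missing = destination_fields - set(targets)
    duplicated = {t for t in targets if targets.count(t) > 1}
    if unknown or missing or duplicated:
        raise ValueError(
            f"bad mapping: unknown={sorted(unknown)}, "
            f"missing={sorted(missing)}, duplicated={sorted(duplicated)}"
        )


def translate(rows, mapping):
    """Yield destination records one at a time; nothing is stored here."""
    for row in rows:
        yield {dest: convert(row[src]) for src, (dest, convert) in mapping.items()}


validate_mapping(MAPPING, DESTINATION_FIELDS)
source_rows = [{"facility": " Clinic 12 ", "visits": "40"}]
print(list(translate(source_rows, MAPPING)))
```

Validating the mapping once, before any data flows, catches structural mistakes early. Errors in individual values (for example a non-numeric visit count) would surface later, when the converter raises an exception.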

Step Four - Loading
- Process where data is ‘deposited’ into a data warehouse (PostgreSQL allows for efficient storage)
- Load process is done through a series of SQL scripts
- Loading process has a series of checks to ensure all data from the source can be accounted for at the destination
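A load with a reconciliation check might look roughly like the sketch below. It uses Python's built-in sqlite3 module in place of PostgreSQL so that it runs without a database server. The table name, columns and records are hypothetical, and the actual SQL scripts used by the presenters are not shown in the slides.

```python
import sqlite3

# Hypothetical transformed records ready for loading.
records = [("Clinic 12", 40), ("Clinic 7", 15), ("Clinic 3", 22)]

# sqlite3 is used only so the sketch runs anywhere; it stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_visits ("
    "facility_name TEXT NOT NULL, visit_count INTEGER NOT NULL)"
)

with conn:  # commits on success, rolls back if an exception is raised
    conn.executemany(
        "INSERT INTO fact_visits (facility_name, visit_count) VALUES (?, ?)",
        records,
    )

# Reconciliation checks: row count and a control total must match the source.
loaded_rows, loaded_total = conn.execute(
    "SELECT COUNT(*), COALESCE(SUM(visit_count), 0) FROM fact_visits"
).fetchone()
source_total = sum(count for _, count in records)

if (loaded_rows, loaded_total) != (len(records), source_total):
    raise RuntimeError("load does not reconcile with source")
print(f"{loaded_rows} rows loaded, control total {loaded_total}")
```

Comparing both a row count and a control total is one simple way to account for source data at the destination; a row count alone would miss values that were altered during the load.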

Conclusion & Questions
Data Transformation is:
- Vital to data analysis across programmes
- Essential to optimise use of current (multiple) data sets
But…
- Requires a high level of database expertise and scripting ability
Questions