Analytics Warehouse P.J. Kelly.



Data Warehouse vs. Data Lake*
            Data Warehouse (BI)       Data Lake (Analytics)
Data        Structured                Structured, semi-structured, unstructured
Storage     Expensive                 Low-cost, commodity hardware
Agility     Fixed                     Agile, reconfigurable
Security    Mature                    Maturing
Users       Business Professionals    Data Scientists, Analysts
*Tamara Dull, Director of Emerging Technologies, SAS Institute, blog, August 28, 2015

Our Analytics approach. Data Exploration Tools Metadata Data Transformation Compatibility Data Modelling

Data Exploration – Previous There are limits on how much data we can store in our existing SAS environment, because it sits on a single server. Data context is determined solely from the library and dataset name. Each additional dataset requires discovery and lead time to determine its suitability, extract it from the source system and ship it across manually.

Data Exploration

Data Exploration If the library names and table names are meaningless to you, Analysts new to the data have the same experience!

Data Exploration - Hadoop Hadoop can scale to accommodate most or all of Revenue's data sources, including sources in formats not readily compatible with previous systems. Once the data is in Hadoop it can be given context through the metadata features. Analysts looking for a particular insight can essentially “google” their data.

Data Exploration – Hadoop

Data Exploration – Hadoop Search on tax head or source system, or find the data underlying specific dashboards. If there’s a gap in the metadata, you can fill it. We can tailor our metadata approach to analysts’ requirements.
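The search-and-fill idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `MetadataCatalogue`, the dataset names and the tag values are all hypothetical, and the class is an in-memory stand-in rather than the API of any real Hadoop metadata tool.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """One catalogue record: a dataset name plus free-form metadata tags."""
    name: str
    tags: dict = field(default_factory=dict)


class MetadataCatalogue:
    """Hypothetical in-memory stand-in for a metadata store."""

    def __init__(self):
        self._entries = {}

    def register(self, name, **tags):
        # Add a dataset, or fill a metadata gap on an existing one.
        entry = self._entries.setdefault(name, DatasetEntry(name))
        entry.tags.update(tags)

    def search(self, **criteria):
        # Return the names of datasets whose tags match every criterion.
        return sorted(
            entry.name
            for entry in self._entries.values()
            if all(entry.tags.get(key) == value for key, value in criteria.items())
        )


# Illustrative entries only; none of these names come from the presentation.
catalogue = MetadataCatalogue()
catalogue.register("raw.vat_returns", tax_head="VAT", source_system="system_a")
catalogue.register("raw.paye_submissions", tax_head="PAYE", source_system="system_b")
catalogue.register("curated.vat_summary", tax_head="VAT",
                   source_system="system_a", dashboard="vat_overview")

vat_datasets = catalogue.search(tax_head="VAT")
dashboard_sources = catalogue.search(dashboard="vat_overview")

# An analyst who learns which dashboard uses a dataset fills the gap.
catalogue.register("raw.paye_submissions", dashboard="paye_overview")
paye_sources = catalogue.search(dashboard="paye_overview")
```

Because `register` merges tags into an existing entry, anything one analyst learns becomes searchable for the next.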

Data Exploration - Hadoop

Data Exp. – Workflow - Hadoop Do you know if the data is already in Hadoop? Do you know where? [Workflow diagram; labels: Analysts, Analytics Concept, ATS] If navigator is populated, your search ends there. If not, the insight captured at multiple stages in the process can be added to the metadata stores, shortening the search for future analysts.

Data Transformation The data exists in a raw or unprocessed state. The total data that’s needed is spread across multiple tables, source systems etc. It needs to be joined, filtered and cleaned up.
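As a rough illustration of "joined, filtered and cleaned up", the sketch below applies all three steps to two tiny in-memory extracts using only the Python standard library. The record layouts, field names and values are hypothetical and are not taken from any Revenue system.

```python
# Hypothetical raw extracts from two source systems.
taxpayers = [
    {"taxpayer_id": "1", "name": "  Alice Ltd "},
    {"taxpayer_id": "2", "name": "bob & co"},
    {"taxpayer_id": "3", "name": "Carol plc"},
]
payments = [
    {"taxpayer_id": "1", "amount": "100.50"},
    {"taxpayer_id": "2", "amount": ""},   # missing amount: filtered out
    {"taxpayer_id": "1", "amount": "20"},
    {"taxpayer_id": "9", "amount": "5"},  # unknown taxpayer: dropped by the join
]


def transform(taxpayers, payments):
    """Return payments joined to taxpayer names, filtered and cleaned."""
    # Clean: strip stray whitespace from names and index them by key.
    names = {t["taxpayer_id"]: t["name"].strip() for t in taxpayers}
    rows = []
    for payment in payments:
        # Filter: skip records with no amount.
        if not payment["amount"]:
            continue
        # Join: keep only payments with a known taxpayer (inner join).
        if payment["taxpayer_id"] not in names:
            continue
        rows.append({
            "taxpayer_id": payment["taxpayer_id"],
            "name": names[payment["taxpayer_id"]],
            "amount": float(payment["amount"]),  # clean: text to number
        })
    return rows


model_ready = transform(taxpayers, payments)
```

In practice the same steps would be expressed in whichever tool suits the job; the point is only that the raw tables become one tidy dataset.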

Data Transformation - SAS E.Guide Excellent for building ad-hoc processes to explore how best to transform data for a particular use. Extensive existing experience within Revenue and a relatively low learning curve for data analysts. Transformations have to be recreated from scratch to be “productionised”. It’s an analyst tool.

Data Transformations - Hadoop Suitable for ad-hoc queries and for building production end-to-end Extraction, Transformation and Loading (ETL) processes. Multiple toolsets and code types to choose from. Any transformations are immediately accessible in E.Guide through an official SAS/Access Hadoop connector. It’s an Analyst and Developer tool.

Data Transformation - Hadoop

Data Transformation - Hadoop Hadoop isn’t a single toolset. You can read from and write to the same datasets using a number of different methods, including SAS Enterprise Guide. Two analysts can collaborate on the same data: one using native code in Hadoop, the other using the drag-and-drop interface in SAS.

Data Transformation A great deal of work is done processing raw data into states usable by reports and models. This processing can obscure the origin and true meaning of the data and make it difficult to reuse. Currently the only options are to read the project documentation, speak to the original developer or unravel the live build process.

Data Lineage Any development done within the Analytics Warehouse will have full Data Lineage Tracking. While complex, it tracks every stage of transformation and maintains a full picture of where the data has come from.
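A minimal sketch of what lineage tracking records, assuming each transformation logs its output, its inputs and a short description. `LineageTracker` and the dataset names are hypothetical; the lineage tooling in the Analytics Warehouse is not described in enough detail here to reproduce.

```python
class LineageTracker:
    """Hypothetical lineage log: each output records its inputs and step."""

    def __init__(self):
        self._parents = {}  # dataset -> (step description, list of inputs)

    def record(self, output, inputs, step):
        self._parents[output] = (step, list(inputs))

    def trace(self, dataset):
        # Walk upstream, returning (output, step, input) edges.
        edges = []
        seen = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            if current in seen or current not in self._parents:
                continue
            seen.add(current)
            step, inputs = self._parents[current]
            for source in inputs:
                edges.append((current, step, source))
                stack.append(source)
        return edges

    def origins(self, dataset):
        # Origins are upstream datasets that no recorded step produced.
        upstream = {source for _, _, source in self.trace(dataset)}
        return sorted(s for s in upstream if s not in self._parents)


# Illustrative three-step build; all names are made up.
tracker = LineageTracker()
tracker.record("staging.payments_clean", ["raw.payments"], "filter empty amounts")
tracker.record("curated.payments_by_taxpayer",
               ["staging.payments_clean", "raw.taxpayers"], "join on taxpayer_id")
tracker.record("report.monthly_totals",
               ["curated.payments_by_taxpayer"], "aggregate by month")

raw_sources = tracker.origins("report.monthly_totals")
lineage_edges = tracker.trace("report.monthly_totals")
```

With a log like this, a report can be traced back to its raw inputs without consulting the original developer.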

Data Lineage

Data Lineage

ETL and Analytics There can be no Analytics without ETL. Presenting the data in a processed, model-ready format, with rich metadata and the ability to track every dataset through every transformation, frees the Analysts up for the highly skilled process of creating the end product.

ETL and Analytics ETL engineers within ICT&L focus on building industrialised, metadata-rich data assets with a long-term focus. Analysts focus on using these assets to their full potential with no technological restrictions.

Adv. Analytics - Toolsets Hadoop is constantly evolving with the field. A multitude of open source tools, currently released or in development, can be seamlessly added with zero compatibility issues. All of these tools can read from and write to the same datasets, allowing collaboration between different analysts using different toolsets.

Thank you