1
SAS® Data Integration Solution
Gary Gray, Solutions Specialist
2
Data Integration Tools
SAS Positioned as a Leader
Gartner Magic Quadrant
Data Integration Tools: September, 2008
Data Quality Tools: May, 2008
Figure 1. Magic Quadrant for Data Quality Tools
The Magic Quadrant is copyrighted 2008 by Gartner, Inc. and is reused with permission. The Magic Quadrant is a graphical representation of a marketplace at and for a specific time period. It depicts Gartner’s analysis of how certain vendors measure against criteria for that marketplace, as defined by Gartner. Gartner does not endorse any vendor, product or service depicted in the Magic Quadrant, and does not advise technology users to select only those vendors placed in the “Leaders” quadrant. The Magic Quadrant is intended solely as a research tool, and is not meant to be a specific guide to action. Gartner disclaims all warranties, express or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
This Magic Quadrant graphic was published by Gartner, Inc. as part of a larger research note and should be evaluated in the context of the entire report. The Gartner report is available upon request from DataFlux.
Source: Magic Quadrant for Data Quality Tools, 4 June 2008, Ted Friedman, Andy Bitterer.
3
The Enterprise Intelligence Platform
METADATA
Copyright © 2005, SAS Institute Inc. All rights reserved.
4
Data Integration: What Is It?
Extract, transform, and load data
From any source to any target
Includes data movement
Results in accurate, cleansed data
Timely delivery, ready for the business
ETL stands for extract, transform, and load. It involves pulling data from any data source and loading it into any target. It includes moving the data and transforming it to meet requirements, usually integrated data in a particular data model in a particular database or structure, so that it can be delivered in a timely fashion to serve the needs of the business.
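The extract, transform, and load steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how SAS Data Integration Studio works internally: the sample CSV, the `customer` table, and the function names `extract`, `transform`, and `load` are all hypothetical, and an in-memory SQLite database stands in for the target.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real source could be any file, database, or queue.
SOURCE_CSV = """customer_id,name,state
1, alice smith ,nc
2,BOB JONES,NC
3,,tx
"""

def extract(text):
    """Extract: read rows from a CSV source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim whitespace, normalise case, drop rows with no name."""
    cleaned = []
    for row in rows:
        name = row["name"].strip().title()
        if not name:
            continue  # reject incomplete records before loading
        cleaned.append((int(row["customer_id"]), name, row["state"].strip().upper()))
    return cleaned

def load(rows, conn):
    """Load: write the cleansed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer "
        "(id INTEGER PRIMARY KEY, name TEXT, state TEXT)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
    conn.commit()

# An in-memory SQLite database stands in for the target (an assumption).
conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), conn)
result = conn.execute("SELECT id, name, state FROM customer ORDER BY id").fetchall()
print(result)
```

Keeping the three stages as separate functions means the cleansing logic can be changed without touching the code that reads the source or writes the target.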
5
Data Integration Initiatives
Every organization will at some point have to address these types of initiatives or programs to integrate its data resources. Organizations need to prepare to support such activities with a solution that helps in all cases, one where both rules and skills can be reused from project to project. Organizations need to find a strategic supplier to meet these needs. Carrying out many of these programs requires ETL capabilities, either standalone or combined with other technology.
Strategically, the ETL capabilities should provide a complete set of shared, reusable data integration services that address all of the organization’s data integration needs. This approach ensures consistency and reliability. Unified metadata should sit at the heart of the solution and interoperate with existing enterprise metadata stores.
The solution should provide access to all data sources, in both the data warehouse environment and the operational environment: relational database management systems such as Oracle, DB2, Teradata, and SQL Server, PC file formats, real-time message queues, flat files, mainframe data, and XML.
6
The SAS Data Integration (DI) Solution
Data integration platform to quickly build, deploy, and manage metadata-driven DI process flows
Perform in-depth transformations with minimal programming
Leverage >300 BI-specific built-in functions
Identify and clean dirty data prior to loading
Seamlessly load data into optimized data storage
Scalable to leverage IT computing infrastructure, including grid enablement
Interoperable through open standards
7
SAS® Data Integration Studio
Visual/Multi-User Design Tool: for building, implementing, and managing ETL processes from source to destination
DataFlux Data Quality Integration: improve quality and productivity by analyzing data before writing validation code
Unrivaled data access capabilities: getting to the right data quickly and easily
Message Queue Integration: wizard-driven support for both reading and writing for EAI architecture
Robust Deployment/Web Services enabled: deploy ETL processes as services for easy deployment in a service-oriented architecture
8
Making Data a Valuable Asset Data Quality Integration Platform
9
Single Data Quality Platform
[Diagram: a single data quality and integration engine exposed through several products]
GUI design environment: dfPower Studio
Batch and real-time execution: Integration Server (SOA, batch, real time)
Integration: dfConnectors, DataFlux Adaptors (Informatica, Siebel, SAP, SAS, OWB)
Process: Profile, Quality, Integration, Augmentation, Monitor
Single Data Quality and Integration Engine: Data Access, Profiling, Rule Validation, Parsing, Standardization, Match/Consolidation, Householding, Address Quality, Geocoding, Demographic Discovery, Phone Validation, and much more
Quality Knowledge Base: GUI-based customization
The DataFlux solution is designed to solve this complex problem. The solution consists of multiple products that work together to form a single enterprise-class data quality solution. Starting from the ground up, DataFlux has built a single data quality and data integration engine, illustrated by the blue bar in the diagram. The engine is then exposed through different products, each one solving different types of data quality problems for different types of users.
dfPower Studio is designed for your business and IT analysts. The engine supports data profiling and integration via an intuitive point-and-click user interface. This solution allows users to connect directly to their data sources and automatically discover and correct data quality issues.
The Integration Server is designed to support batch and real-time processing of data in larger-volume environments. Any workflow created in dfPower can be executed via a Web Service, either on a batch set of data or on a transactional record.
Connectors expose the exact same engine through different enterprise applications. Informatica exposes DataFlux data quality algorithms within the Informatica GUI. The Siebel and SAP integrations offer real-time de-duplication and postal validation within the enterprise application. SAS and OWB expose the DQ engine within these environments.
Today we will focus on dfPower Studio, as it is typically used as the design environment for all DQ processes.
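Two of the engine capabilities named on this slide, standardization and match/consolidation, can be illustrated with a small Python sketch. It is a deliberately simple exact-key approach and makes no claim to reproduce the DataFlux algorithms: the `ABBREVIATIONS` table, the record layout, and the functions `standardize`, `match_key`, and `consolidate` are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table; a real knowledge base would be far larger.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd"}

def standardize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace, abbreviate."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)

def match_key(record):
    """Build a key from standardized fields; equal keys are treated as a match."""
    return (standardize(record["name"]), standardize(record["address"]))

def consolidate(records):
    """Group record ids that share a match key into clusters of likely duplicates."""
    clusters = defaultdict(list)
    for record in records:
        clusters[match_key(record)].append(record["id"])
    return sorted(clusters.values())

records = [
    {"id": 1, "name": "Jane Doe", "address": "12 Main Street"},
    {"id": 2, "name": "JANE  DOE", "address": "12 Main St."},
    {"id": 3, "name": "John Roe", "address": "4 Oak Avenue"},
]
clusters = consolidate(records)
print(clusters)
```

Records 1 and 2 differ only in case, spacing, punctuation, and abbreviation, so they standardize to the same key and fall into one cluster. A production tool would typically add fuzzy matching on top of this kind of exact-key grouping.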
10
Quality Knowledge Base
Batch Processing
Data management process: Profile, Quality, Integrate, Enrich, Monitor
dfPower Studio Templates map the process
Business rules defined for reuse later
[Diagram labels: dfPower Studio, DataFlux Integration Server, Templates, BATCH, REAL-TIME, SOA Environment, Business Services, ERP System, CRM System, Legacy System, Quality Knowledge Base, Reference Databases, Enterprise Integration Suite]
11
Quality Knowledge Base
Real-Time Processing
Services modified using dfPower Studio
Web Services interface for simple integration with existing applications via SOA
Native APIs (C++, Java, COM, .NET, Perl) also supported
Batch business rules reused in real time
Batch and real-time servers can be deployed to handle load
[Diagram labels: dfPower Studio, DataFlux Integration Server, Templates, BATCH, REAL-TIME, SOA Environment, Web Services, Business Services, Web App, ERP System, CRM System, Quality Knowledge Base, Reference Databases, Enterprise Integration Suite]
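The idea that a business rule defined for batch use can be reused in real time can be sketched as follows. This is an assumption-laden illustration, not the DataFlux Integration Server interface: the rule `standardize_phone` (which assumes 10-digit North American numbers), the batch entry point `run_batch`, and the per-record entry point `handle_request` are hypothetical names, and no actual web service is started.

```python
import re

def standardize_phone(raw):
    """Shared rule: return a 10-digit number as NNN-NNN-NNNN, or None if invalid."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop a leading country code
    if len(digits) != 10:
        return None
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"

def run_batch(values):
    """Batch mode: apply the rule to a whole set of records."""
    return [standardize_phone(value) for value in values]

def handle_request(payload):
    """Real-time mode: apply the same rule to one transactional record."""
    cleaned = standardize_phone(payload["phone"])
    return {"valid": cleaned is not None, "phone": cleaned}

batch_result = run_batch(["(919) 555-0100", "1 919 555 0101", "555-01"])
realtime_result = handle_request({"phone": "919.555.0100"})
print(batch_result)
print(realtime_result)
```

Because both entry points call the same function, a change to the rule takes effect in batch and real-time processing at once, which is the consistency benefit the slide points to.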
12
Solutions vs. Accelerators
DQIS PLATFORM: Core Data Management Technology
ACCELERATORS: Service Template Bundles and Associated Components
MDM SOLUTIONS: Services Templates, Data Models, Web Applications
13
Data Integration Developer Quality Knowledge Base
SAS® Roles
[Diagram labels: Data Steward, Data Integration Developer, SAS Data Integration Studio, Quality Knowledge Base, dfPower Studio, Business focus, Technology focus]
14
Copyright © 2006, SAS Institute Inc. All rights reserved.