1 AFPOA Virtual Vendor Day Topic: Data Integration
Gregory J. Vaughan, Executive Consultant, WW Military and Defense Lead, Information Agenda Tiger Team

2 There’s no “easy button” for this…
- Data integration is a complex problem
- A myopic view of the problem frustrates the desired end state
- Scoping the problem too narrowly reduces the likelihood of success
- Focusing later on data integration requires a revisit of the problem scope
- Data integration presents the greatest risk to IT-related business initiatives
- Data governance is required, but frequently overlooked
- The complexities of data integration require a comprehensive solution

3 Solution Architecture – General View
[Architecture diagram. Labels recoverable from the slide:]
- Define & Govern
- Operational Systems/Data: applications, internal databases, external databases
- Trusted Information: unstructured content, metadata, operational data, master data; supported by Information Integration, Data Quality, and Information Services
- Analytical Information: data warehouse, data marts, OLAP cubes
- Analytics: predictive analytics, text analytics, optimization
- BI: reports, dashboards, query, OLAP
- Users; internal/external databases

4 The IBM Solution: IBM Information Server Delivering information you can trust
Unified Deployment: Understand, Cleanse, Transform, Deliver
- Understand: Discover, model, and govern information structure and content
- Cleanse: Standardize, merge, and correct information
- Transform: Combine and restructure information for new uses
- Deliver: Synchronize, virtualize and move information for in-line delivery
Unified Metadata Management; Parallel Processing; Rich Connectivity to Applications, Data, and Content
Key Point: The culmination of these efforts has led us to our latest platform offering, the IBM Information Server. IBM Information Server is a revolutionary new software platform from IBM that helps organizations derive more value from the complex, heterogeneous information spread across their systems. It enables organizations to integrate disparate data and deliver trusted information wherever and whenever needed, in line and in context, to specific people, applications, and processes. IBM Information Server helps business and IT personnel to collaborate to understand the meaning, structure, and content of any type of information across any sources. It provides breakthrough productivity and performance for cleansing, transforming, and moving this information consistently and securely throughout the enterprise, so it can be accessed and used in new ways to drive innovation, increase operational efficiency, and lower risk. IBM Information Server is designed to help companies leverage their information across all its sources. It delivers all of the functions required to integrate, enrich and deliver information you can trust for your key business initiatives.
IBM Information Server allows you to:
- Understand all sources of information within the business, analyzing its usage, quality, and relationships
- Cleanse it to assure its quality and consistency
- Transform it to provide enriched and tailored information
- Federate it to make it accessible to people, processes, and applications
IBM Information Server provides:
- access to the broadest range of information sources
- the broadest range of integration functionality, including federation, ETL, in-line transformation, replication, and event publishing
- the most flexibility in how these functions are used, including support for service-oriented architectures, event-driven processing, scheduled batch processing, and even standard APIs like SQL and Java
The breadth and flexibility of the platform enable it to address many types of business problems and meet the requirements of many types of projects. This optimizes the opportunities for reuse, leading to faster project cycles, better information consistency, and stronger information governance. Regarding service-oriented architectures, information integration enables information to be made available as a service, publishing consistent, reusable services for information that make it easier for processes to get the information they need from across a heterogeneous landscape.

5 Align business and IT objectives using a single platform that creates trusted information for use in key initiatives
[Diagram. Labels recoverable from the slide:]
- Roles: Executives, Enterprise Architects, Business Analysts, Subject Matter Experts, Data Analysts & Architects, DBA, Data Steward, Developer, System Manager, System Architect
- Sources and business initiatives: legacy, apps, dbs, SAP, ERP, custom, z/OS, xls/xml/flat files, BI, mdm, warehouse

7 InfoSphere Information Analyzer
Analyze source data quality and monitor adherence to integration and quality rules
Requirements:
- Perform data quality assessment
- Define business rules to monitor data quality
- Establish stewards for governance of data quality
Benefits:
- Identify data quality issues early to reduce project risks
- Monitor quality metrics over time for compliance
- Create business confidence with trusted information
InfoSphere Information Analyzer does just what you would expect based on its name. This product was built specifically to help you understand the level of data quality you have and to ensure that the data remains of high quality across your enterprise. Most people assume the data they have is already high quality. We want to confirm that this is actually true based on the data. Have you ever thought about performing a Data Quality Assessment? Most of our clients find this process very insightful when they are looking at their data. What you end up with is a report about the data values and the frequency with which those values are found. This quickly shows you what type of data you have, so you can make important decisions about what to do with it. Additionally, when you find data that you did not expect or that is not valid, you can measure this over time through our data rules. In this example we are reviewing whether the rules have passed or failed the quality test. Keeping the information you have trusted is important to your business, and IBM has the technology to help. What’s next?
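
To make the two ideas in these notes concrete (a value-frequency report and pass/fail data rules), here is a minimal Python sketch. It is not how Information Analyzer is implemented; the sample records, the column names, and both rules are hypothetical and exist only for illustration.

```python
from collections import Counter

def value_frequencies(rows, column):
    """Count how often each distinct value appears in one column."""
    return Counter(row.get(column) for row in rows)

def evaluate_rule(rows, rule):
    """Return (passed, failed) row counts for a single data rule."""
    passed = sum(1 for row in rows if rule(row))
    return passed, len(rows) - passed

# Hypothetical sample records (not from the presentation).
records = [
    {"id": 1, "state": "VA", "age": 34},
    {"id": 2, "state": "VA", "age": -5},
    {"id": 3, "state": "va", "age": 51},
    {"id": 4, "state": None, "age": 28},
]

def state_is_two_uppercase_letters(row):
    state = row["state"]
    return isinstance(state, str) and len(state) == 2 and state.isupper()

def age_in_plausible_range(row):
    return 0 <= row["age"] <= 120

# Hypothetical data rules, keyed by a readable name.
rules = {
    "state_is_two_uppercase_letters": state_is_two_uppercase_letters,
    "age_in_plausible_range": age_in_plausible_range,
}

state_report = value_frequencies(records, "state")
rule_results = {name: evaluate_rule(records, rule) for name, rule in rules.items()}

for value, count in state_report.most_common():
    print(f"state={value!r}: {count}")
for name, (passed, failed) in rule_results.items():
    print(f"{name}: {passed} passed, {failed} failed")
```

Running the same rules on each new load and storing the counts gives the "quality metrics over time" view the slide describes.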

8 InfoSphere Business Glossary
Create and manage business vocabulary and relationships, and relate them to physical sources
Requirements:
- Capture business terms and classifications
- Link business terms and classifications to IT assets
- Identify data stewards and make glossary accessible
Benefits:
- Context for information is available to everyone, immediately
- IT projects are aligned with data governance
- Collaboration increases across business and IT
Business metadata is recorded in InfoSphere Business Glossary. Business Glossary provides a Web-based tool for authoring, managing, and sharing business metadata. This tool is designed for business users and subject-matter experts to define data stewards and record business terminology definitions and hierarchies. As you can see from the example at the bottom left of the slide, business and technical views of information are very different. The technical view is focused on defining the structure and location of information, while the business view is focused on the usage and characteristics of information, and the rules that govern it. For example, multiple systems may maintain tables of customer information; however, the business may uncover a requirement for the concept of “high-value” customers. The business needs a way to define what a high-value customer is, and how to recognize one (e.g. a high-value customer is a customer with combined account balances over $1MK). Business Glossary provides a tool for recording these definitions, and relating business concepts together into taxonomies. This records the business requirements in the same metadata foundation that the profiling and analysis process uses.
Silver bullets:
- Provides comprehensive management of data stewards, terms, and taxonomies
- Allows users across the organization to share and collaborate on business definitions through a Web-based interface
- Shares information in the common metadata framework, making it available to other tools and users throughout the organization

10 InfoSphere QualityStage
Standardize, cleanse and deduplicate data, ensuring a complete, accurate view of information
Requirements:
- Resolution of data quality issues
- Standardization of data formats
- Cleanse data
- Manage duplicate data
- Enable ongoing quality
Benefits:
- Removes duplicates
- Cross-references matching records
- Survives a single, complete record
- Validates and enriches data
InfoSphere QualityStage is our data cleansing, data standardization, data matching and data validation solution in the Information Server. When customers have information coming in from a variety of sources, they typically need to synthesize all of the data they have into a common format or standard for their target environment. InfoSphere QualityStage is built just for that business purpose. Whether you are removing duplicates from the data you have or merging multiple systems into one system, you will need an approach to matching all of that data together. But before you can get the best possible matching result, you must create a standard for the data. InfoSphere QualityStage will set a standard for name, address, product, location or any other data you choose to process. Once the standard is in place, you will have the ability to find all of the duplicates within the data and decide what to do with the result. Typically our clients will merge them into a single view and create the best location data possible; the result of this process is data you can trust. Your systems demand accuracy and QualityStage is built to provide it.
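
The standardize, match, survive sequence described above can be sketched in a few lines of Python. This is an illustration only: it uses a simple string-similarity ratio from the standard library rather than QualityStage's probabilistic matching, and the abbreviation table, the 0.85 threshold, and the sample records are all assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical abbreviation table used to standardize street types.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd"}

def standardize(text):
    """Lowercase, drop punctuation, and apply the abbreviation table."""
    cleaned = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(ABBREVIATIONS.get(word, word) for word in cleaned.split())

def match_key(record):
    """Build the standardized string that records are compared on."""
    return standardize(record["name"]) + "|" + standardize(record["address"])

def is_match(a, b, threshold=0.85):
    """Treat two records as duplicates when their keys are similar enough."""
    return SequenceMatcher(None, match_key(a), match_key(b)).ratio() >= threshold

def group_duplicates(records):
    """Greedy grouping: compare each record to the first member of each group."""
    groups = []
    for record in records:
        for group in groups:
            if is_match(group[0], record):
                group.append(record)
                break
        else:
            groups.append([record])
    return groups

def survive(group):
    """Build one surviving record, taking each field from the most complete
    record that has a non-empty value for it."""
    ranked = sorted(group, key=lambda r: sum(1 for v in r.values() if v), reverse=True)
    return {
        field: next((r[field] for r in ranked if r.get(field)), "")
        for field in ranked[0]
    }

# Hypothetical input records.
records = [
    {"name": "JOHN SMITH", "address": "12 Main Street", "phone": "", "email": "jsmith@example.com"},
    {"name": "John Smith", "address": "12 Main St.", "phone": "555-0100", "email": ""},
    {"name": "Mary Jones", "address": "4 Oak Avenue", "phone": "555-0199", "email": ""},
]

groups = group_duplicates(records)
survivors = [survive(group) for group in groups]
```

Note that matching only works here because both addresses standardize to the same form first, which is the point the notes make about setting a standard before matching.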

12 InfoSphere Metadata Workbench
Support information governance with traceability on data movement, modeling & BI applications
Requirements:
- Handle Change Management processes with measured impact
- Visualize and trace information flows across enterprise landscape
- Access and report on operational and design metadata
Benefits:
- Deliver enterprise audit control information
- Mediate system disruptions
- Govern enterprise assets over time
- Ensure effective collaboration with line of business stakeholders
InfoSphere Metadata Workbench was designed to help you get the answers you need in a very simple and targeted approach. Remember the discussion at the beginning of this presentation about the central metadata. We have created a single metadata environment so you can have a single reporting approach that answers every type of question. Common questions include: Where did this data come from? Where is it going? Who built this application? Did it run correctly? How many rows of data went from A to B to C? These are the types of questions our clients expect to answer. This helps them understand and document their information assets as well as their integration landscape. Think about this operationally for a moment. What if something is wrong on my BI report and my manager is demanding an answer? The business demands that I get that answer quickly and accurately. With InfoSphere Metadata Workbench you can quickly analyze the data flows, the changes that occurred to them, and whether or not they have run to completion. That’s a lot of information at your fingertips.
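
The questions "Where did this data come from?" and "Where is it going?" amount to walking a lineage graph in one direction or the other. The sketch below shows that idea in Python under stated assumptions: the asset names and edges are hypothetical, and a real metadata repository would hold far richer information than source/target pairs.

```python
from collections import deque

# Hypothetical lineage edges: (source asset, target asset).
EDGES = [
    ("crm.customers", "staging.customers"),
    ("erp.orders", "staging.orders"),
    ("staging.customers", "warehouse.sales_fact"),
    ("staging.orders", "warehouse.sales_fact"),
    ("warehouse.sales_fact", "bi.revenue_report"),
]

def trace(start, edges, direction="upstream"):
    """Breadth-first walk of the lineage graph from one asset.

    upstream   -> every asset that feeds `start` (where did it come from?)
    downstream -> every asset fed by `start` (what is impacted by a change?)
    """
    neighbors = {}
    for source, target in edges:
        if direction == "upstream":
            neighbors.setdefault(target, []).append(source)
        else:
            neighbors.setdefault(source, []).append(target)

    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Data lineage for a BI report, and impact analysis for a source table.
origins = trace("bi.revenue_report", EDGES, "upstream")
impacted = trace("crm.customers", EDGES, "downstream")
```

The downstream walk is the same traversal used for change-impact analysis: before altering `crm.customers`, list everything in `impacted`.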

14 InfoSphere Data Architect
Model, visualize, and relate diverse and distributed data assets
Requirements:
- Design and manage enterprise models
- Enforce model conformance to enterprise standards
- Leverage industry data models for best practices
Benefits:
- Speed design activities
- Populate Business Glossary from model terms
- Validate models for enterprise conformance
InfoSphere Data Architect helps simplify data modeling and integration design activities by enabling architects to discover, model, visualize and relate diverse and distributed data assets. As part of executing your Information Agenda strategy, you must have a solid foundation upon which your application architecture relies for its data storage. InfoSphere Data Architect helps organizations model and optimize new and existing sources to support business objectives. InfoSphere Data Architect acts as a modeling gateway to the InfoSphere Foundation Tools, interchanging glossary and physical metadata to enhance collaboration between team members. The tight integration between the IBM Industry Models, InfoSphere Data Architect and InfoSphere Foundation Tools allows organizations to exploit industry-specific business and technical metadata to accelerate data integration projects. For example, organizations can leverage the IBM Industry Models as a starting point for projects. They have the option to customize these data models and add specific business terms from within InfoSphere Data Architect, which can then be shared across the enterprise using the Foundation Tools.
Data Architect lets you:
- Create logical and physical data models, including privilege models
- Define data attributes, including domain constraints and privacy attributes
- Discover, explore, and visualize the structure of data sources
- Compare and synchronize the structure of two data sources
InfoSphere Data Architect is more than a data modeling tool. It is also a:
- Documentation tool. It helps you to create diagrams of existing database structures.
- Information integration tool. Helps to define federation concepts.
- XML mapping tool. Map database schemas to SOA structures.
- Code development tool. Create valid DB2 SQL code. IBM Data Studio is the product that does all this outside of IDA.
- Traceability tool. Know why, what and when for every change.
New release features: integrations with IBM Rational Software Architect, Eclipse 3.2 and IBM Information Server; additional mappings and expanded support for XML, DB2 V9, Sybase, Informix and MySQL.
InfoSphere Data Architect is differentiated from other data modeling tools by its lifecycle integration, which enhances productivity, quality, and governance. Leverage integration with the Rational Software Delivery Platform, InfoSphere, Data Studio, Optim, and IBM Industry Models:
- Application lifecycle: leverage InfoSphere Data Architect with the Rational Software Delivery Platform, providing seamless integration across application, process, and data models. In addition, a common repository for all project artifacts provides built-in team collaboration across business analysts, architects, developers, and administrators.
- Data lifecycle: leverage InfoSphere Data Architect with the Integrated Data Management IBM Optim portfolio to manage data, databases, and data-driven applications from requirements to retirement.
- Trusted data: leverage InfoSphere Data Architect with the IBM InfoSphere, Industry Models and Cognos portfolios to deliver trusted information assets.
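
One capability listed above, comparing the structure of two data sources, can be illustrated with a small Python sketch. It is a simplification under stated assumptions: each schema is represented as a plain dictionary of table name to column types, and the table and column names are hypothetical. It says nothing about how Data Architect itself performs the comparison or the synchronization.

```python
def diff_schemas(left, right):
    """Compare two schemas given as {table: {column: type}} dictionaries."""
    report = {
        "tables_only_in_left": sorted(set(left) - set(right)),
        "tables_only_in_right": sorted(set(right) - set(left)),
        "column_differences": {},
    }
    for table in sorted(set(left) & set(right)):
        left_cols, right_cols = left[table], right[table]
        shared = sorted(set(left_cols) & set(right_cols))
        changes = {
            "only_in_left": sorted(set(left_cols) - set(right_cols)),
            "only_in_right": sorted(set(right_cols) - set(left_cols)),
            "type_changes": {
                col: (left_cols[col], right_cols[col])
                for col in shared
                if left_cols[col] != right_cols[col]
            },
        }
        # Record the table only when something actually differs.
        if any(changes.values()):
            report["column_differences"][table] = changes
    return report

# Hypothetical design model versus deployed database.
model = {
    "customer": {"id": "INTEGER", "name": "VARCHAR(100)"},
    "orders": {"id": "INTEGER"},
}
database = {
    "customer": {"id": "INTEGER", "name": "VARCHAR(50)", "email": "VARCHAR(255)"},
}

report = diff_schemas(model, database)
```

A synchronization step would then turn each entry in the report into a change to one side or the other; that part is left out here.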

15 InfoSphere FastTrack
Capture design specifications and accelerate translation into data integration projects
Requirements:
- Capture business requirements for source to target mappings
- Leverage source analysis and business vocabulary
- Generate candidate ETL jobs
Benefits:
- Accelerate development of integration processes
- Centralized management of specifications
- Audit design decisions over time
InfoSphere FastTrack enables the centralization and tracking of all business specification requirements from inception through design and eventual fulfillment. These specifications include the business logic required to translate source data into a consumable format for a target application, for example, defining a mathematical calculation for populating a profitability column in a data warehouse. These business mapping requirements can then be re-used and serve as an audit trail for design decisions made during the development process, or provide historical reporting. InfoSphere FastTrack can also translate these business requirements into InfoSphere integration jobs, bridging the gap between the business analyst, data modeler and the integration developer.
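
A source-to-target mapping specification of the kind described here can be pictured as data: each target column names its source columns, a rule, and a note explaining the decision. The Python sketch below follows the profitability example from the notes. The column names and the margin formula are assumptions made for illustration, and nothing here reflects FastTrack's actual specification format.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass(frozen=True)
class Mapping:
    """One row of a hypothetical source-to-target mapping specification."""
    target: str                 # target column name
    sources: Tuple[str, ...]    # source columns the rule reads
    rule: Callable[..., object] # business logic applied to the source values
    note: str                   # rationale, kept for the audit trail

def to_customer_id(cust_no):
    return int(cust_no)

def to_profitability(revenue, cost):
    # Assumed definition: margin as a fraction of revenue.
    return round((revenue - cost) / revenue, 4) if revenue else None

SPEC = [
    Mapping("customer_id", ("cust_no",), to_customer_id, "Cast source key to integer"),
    Mapping("profitability", ("revenue", "cost"), to_profitability, "Margin as a fraction of revenue"),
]

def apply_spec(row, spec):
    """Produce one target row by applying every mapping to a source row."""
    return {m.target: m.rule(*(row[s] for s in m.sources)) for m in spec}

def describe(spec):
    """Render the specification as readable lines for review or audit."""
    return [f"{', '.join(m.sources)} -> {m.target}: {m.note}" for m in spec]

result = apply_spec({"cust_no": "0042", "revenue": 200.0, "cost": 150.0}, SPEC)
audit_lines = describe(SPEC)
```

Keeping the rule and its note together is what lets the same specification serve both as executable logic and as the record of why a design decision was made.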

16 IBM InfoSphere Optim Data Masking Solution
Information Governance Core Disciplines: Understand & Define, Secure & Protect, Monitor & Audit. Discipline highlighted: Security and Privacy.
De-identify sensitive information with realistic but fictional data for testing & development purposes
Requirements:
- Protect confidential data used in test, training & development systems
- Implement proven data masking techniques
- Support compliance with privacy regulations
Solution supports custom & packaged ERP applications
Benefits:
- Protect sensitive information from misuse and fraud
- Prevent data breaches and associated fines
- Achieve better data governance
[Figure. Example names shown on the slide: JASON MICHAELS, ROBERT SMITH.] Personally identifiable information is masked with realistic but fictional data for testing & development purposes.
InfoSphere Optim Data Masking Solution protects an organization’s data in non-production environments by de-identifying (or masking) sensitive or personally identifiable data. The Optim solution doesn’t keep the data from being stolen, but rather renders the data unusable and of no value if stolen. This protects the business both financially and from loss of information, and provides IT with a simple-to-use solution that supports a common way of protecting data used in non-production (test, development) environments or by third-party contractors. The Optim Data Masking Solution comes with a multitude of built-in masking functions, as well as the ability to define your own transformations. There is no longer a reason to needlessly expose your sensitive data in your test environments.
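
As a rough illustration of "realistic but fictional" masking, the Python sketch below replaces a name with one drawn from a fixed list and replaces digits while preserving the original format. It is a sketch under stated assumptions, not one of Optim's built-in masking functions: the name lists, the secret, and the keyed-hash approach are all choices made here for illustration.

```python
import hashlib
import hmac

# Hypothetical pools of replacement names.
FIRST_NAMES = ["Robert", "Linda", "James", "Maria", "David", "Susan"]
LAST_NAMES = ["Smith", "Johnson", "Lee", "Garcia", "Brown", "Davis"]

def _digest(value, secret):
    """Keyed hash so the same input always masks to the same output."""
    return hmac.new(secret.encode("utf-8"), value.encode("utf-8"), hashlib.sha256).digest()

def mask_name(full_name, secret="demo-secret"):
    """Replace a name with a fictional one chosen deterministically."""
    digest = _digest(full_name.upper(), secret)
    first = FIRST_NAMES[digest[0] % len(FIRST_NAMES)]
    last = LAST_NAMES[digest[1] % len(LAST_NAMES)]
    return f"{first} {last}"

def mask_digits(value, secret="demo-secret"):
    """Replace every digit, leaving separators and length unchanged."""
    digest = _digest(value, secret)
    out = []
    position = 0
    for ch in value:
        if ch.isdigit():
            out.append(str(digest[position % len(digest)] % 10))
            position += 1
        else:
            out.append(ch)
    return "".join(out)

masked_name = mask_name("Jason Michaels")
masked_id = mask_digits("123-45-6789")
```

Deterministic masking keeps joins between masked tables consistent, since the same input always yields the same replacement. The trade-off in this sketch is that a small name pool means different people can receive the same fictional name.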

17 IBM InfoSphere Optim Test Data Management Solution
Information Governance Core Disciplines: Understand & Define, Secure & Protect, Monitor & Audit. Disciplines highlighted: Security and Privacy, Test Data Management.
Create “right-size” production-like environments for application testing
Requirements:
- Create referentially intact, “right-sized” test databases
- Automate test result comparisons to identify hidden errors
- Shorten iterative testing cycles and accelerate time to market
Benefits:
- Deploy new functionality more quickly and with improved quality
- Easily refresh & maintain test environments
- Reduce storage and operational costs
[Diagram. A 2 TB production database (or production clone) passes through Subset & Mask into smaller environments: Development, Unit Test, Training, and Integration Test. Sizes shown on the slide: 25 GB, 25 GB, 50 GB, 100 GB.]
Creating realistic application development and testing environments is critical to delivering the right solutions for the business. However, cloning large production databases for development and testing purposes extends cycle times, increases the amount of data propagated across the organization, and significantly raises costs and governance control issues. The Optim Test Data Management Solution offers proven technology to optimize and automate processes that create and manage data in non-production (testing, development and training) environments. Development and testing teams can create realistic, “right-sized” test databases, made up of one or more business objects, for targeted test scenarios. The Optim Test Data Management Solution also allows teams to easily compare the data from “before” and “after” testing with speed and accuracy. Optim’s capabilities for creating and managing test data enable organizations to save valuable processing time, ensure consistency and reduce costs throughout the application lifecycle.
InfoSphere Optim TDM supports data on distributed platforms (LUW) and z/OS. Out-of-the-box subset support for packaged ERP/CRM application solutions, as well as: Other
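
"Referentially intact" means that every child row kept in the subset still has its parent row. A minimal Python sketch of that idea follows, assuming a hypothetical three-table schema (customers, orders, order items) held as lists of dictionaries. It illustrates the principle only and is not how Optim extracts data.

```python
def subset(customers, orders, order_items, customer_ids):
    """Keep the chosen customers plus every row that depends on them."""
    kept_customers = [c for c in customers if c["id"] in customer_ids]
    kept_customer_ids = {c["id"] for c in kept_customers}

    # Follow the foreign keys downward so no child row loses its parent.
    kept_orders = [o for o in orders if o["customer_id"] in kept_customer_ids]
    kept_order_ids = {o["id"] for o in kept_orders}
    kept_items = [i for i in order_items if i["order_id"] in kept_order_ids]
    return kept_customers, kept_orders, kept_items

def orphans(child_rows, foreign_key, parent_rows):
    """Return child rows whose parent is missing (should be empty)."""
    parent_ids = {p["id"] for p in parent_rows}
    return [row for row in child_rows if row[foreign_key] not in parent_ids]

# Hypothetical production data.
customers = [{"id": 1}, {"id": 2}, {"id": 3}]
orders = [
    {"id": 10, "customer_id": 1},
    {"id": 11, "customer_id": 2},
    {"id": 12, "customer_id": 1},
    {"id": 13, "customer_id": 3},
]
order_items = [
    {"id": 100, "order_id": 10},
    {"id": 101, "order_id": 11},
    {"id": 102, "order_id": 12},
    {"id": 103, "order_id": 13},
    {"id": 104, "order_id": 10},
]

sub_customers, sub_orders, sub_items = subset(customers, orders, order_items, {1})
```

The `orphans` check is a cheap way to confirm the subset is intact before loading it into a test database; a naive random sample of each table would typically fail it.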

18 Guardium: Full Lifecycle of Database Security & Compliance
- A 2010 Verizon study said 92% of the data breached in an environment was breached at the back-end data layer, while the focus on security has been heavily on the network layer
- IBM’s Guardium solution provides real-time database security and monitoring beneath the network layer
- Works by applying an agent at the kernel of the OS, acting as the chokepoint on all SQL traffic to and from the database
- Very low impact on the server, approximately 2%
- Guardium customers: FBI Terrorist Center, Dept. of Justice, IRS, Federal Reserve, FDIC

19 Best Practices Capabilities & Differentiators
- Single data integration platform with multiple components
- Consistent and repeatable methodology for mitigating risks
- Industry-leading Probabilistic Matching Engine for data standardization jobs
- Native Parallel Processing Engine for scalability
- Shared GUI Interface between major components of the platform
- Centralized repository of critical metadata shared across the platform
- Data integration enablement in an SOA environment

20 IBM Information Server Federal Customers
- Agency data migrations
- Authoritative source
- Personnel record consolidation
- System synchronization
- Personnel and recruiting analysis
- Procurement system consolidation
- Real-time data management
- Inventory parts analysis

21 Questions?

