InfoSphere Metadata Comprehensive Deck

Last Updated

Agenda
2008 Metadata Market Survey
InfoSphere Information Server Metadata Approach
InfoSphere Business Glossary
InfoSphere Metadata Workbench
Third Party Metadata
Cognos Integration
Case Studies

2008 Metadata Market Survey
IBM and Gavilan Research Associates (GRA)

IBM and Gavilan 2008 Survey Highlights: Turning your Data into Information with Metadata
Top 5 areas companies said they need to improve their metadata:
  Documenting business metadata (business rules & element descriptions) % of respondents
  Support for Data Governance and Data Stewardship (67.5% of respondents)
  Documenting a Data Warehouse, Data Mart, or Analytical Application (64.8% of respondents)
  Support for Data Lineage reporting and/or Data Mapping analysis (64.9%)
  Implementing a corporate Data Dictionary or Data Glossary (64.7%)
Of those surveyed: less than 7% said their implementation was highly successful!
220 Global 5000 companies surveyed in 2007
IBM & Gavilan Research Associates 2008 Metadata Survey – 287 respondents

What Are Your Top Metadata Management Issues?
Respondents could choose multiple answers, which created a broad spectrum. The “top 5” metadata management issues in 2008 are:
  documenting business metadata (business rules & element descriptions) % of respondents
  support for Data Governance and Data Stewardship (67.5% of respondents)
  documenting a Data Warehouse, Data Mart, or Analytical Application (64.8% of respondents)
  support for Data Lineage reporting and/or Data Mapping analysis (64.9%)
  implementing a corporate Data Dictionary or Data Glossary (64.7%)
Respondent Comments:
  “My top issue is getting metadata out of existing toolsets into a repository, where it can be accessed and reported on.”
  “Change Management is our biggest issue! How could you miss this point?”
Source: IBM / Gavilan Research Associates 2008 MetaData Market Survey

InfoSphere Information Server Metadata Approach

Metadata enables you to put context and meaning to things.
So what exactly is Metadata?
  With a trademark label
  Stored in a can
  Made by
  Since
  Type of food
  With a special ingredient
  With many varieties
This slide is designed to illustrate that metadata, while not always recognized, drives everything we touch and use, even a can of beans. Without the label on this can, you have no idea what you will get when you open the top. Metadata within integration projects is just as critical for understanding the design, usage and relationships between application objects.
If you turn this can around, you would see additional information about its:
  Nutritional value
  Ingredients
  UPC label
  Price
There is also additional information and metadata about this can which is tracked from its creation to its shipment to the store. This includes:
  Packaging (case or pallet)
  Shipping information – where from and to and on what dates
  RFID tag on pallet
  Market basket analysis - other meals you might eat with beans
Metadata enables you to put context and meaning to things. It is generated and consumed by every organization and software product.

Metadata Primer for Business
Business Metadata
  Business rules, Stewardship, Business Definitions, Auditing Terminology, Glossaries, Algorithms and Lineage using business language.
  Audience: Business users.
Technical Metadata
  Defines Source and Target systems, their Table and Field structures and attributes, Documentation for Auditing Derivations and Dependencies.
  Audience: Specific Tool Users – BI, ETL, Profiling, Modeling.
Operational Metadata
  Information about application runs: their frequency, record counts, component-by-component analysis and other statistics for auditing purposes.
  Audience: Operations, Management and Business Users.
Literally, “data about data” that describes your company’s information from both a business and a technical perspective

Let’s put Metadata to an Example
Suppose you have these items of “data”:
  $233,567.00
MetaData is what describes, defines, & inter-relates your key information assets:
  “Social Security Number: a unique identification number issued to citizens, permanent residents, and temporary (working) residents by the US Government ...” pic XXX-XX-XXXX
  “Profit Amount: a currency value that is calculated by combining data from the Customer Master database and Wholesale Inventory applications . . .” Calculation included on Monthly Report
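As a rough illustration of the example above, the sketch below pairs raw values with small metadata records. The DataElement class, the ELEMENTS list, the describe function and both regular expressions are assumptions made for illustration; they are not part of any IBM product. The definitions are paraphrased from the slide.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataElement:
    """Hypothetical metadata record describing one kind of data value."""
    name: str
    definition: str
    pattern: str                    # regular expression a raw value must match
    used_in: Optional[str] = None   # where the value is reported, if known


# Definitions paraphrased from the slide; the patterns are assumptions.
ELEMENTS = [
    DataElement(
        name="Social Security Number",
        definition="unique identification number issued by the US Government",
        pattern=r"\d{3}-\d{2}-\d{4}",   # mirrors the picture XXX-XX-XXXX
    ),
    DataElement(
        name="Profit Amount",
        definition="currency value calculated from Customer Master and "
                   "Wholesale Inventory data",
        pattern=r"\$\d{1,3}(,\d{3})*\.\d{2}",
        used_in="Monthly Report",
    ),
]


def describe(raw_value: str) -> str:
    """Describe raw_value using the first element whose pattern matches it."""
    for element in ELEMENTS:
        if re.fullmatch(element.pattern, raw_value):
            return f"{raw_value} is a {element.name}: {element.definition}"
    return f"{raw_value} has no metadata; its meaning is unknown"
```

Calling describe("$233,567.00") identifies the value as a Profit Amount, while a value with no matching record stays meaningless. In a real repository the metadata would normally be attached to the column or field rather than inferred from the value's shape; the pattern match is only a device to keep the sketch short.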

Metadata helps answer important questions such as:
What data or information exists?
Where is it being used?
What is its business definition?
What other names has it been called or is being called?
How is it inter-related to other information?
Who is using it?
Why do we need it?
When was it last updated?
What is “Profit Amount”?
  How is it defined?
  How is it calculated?
  Where is it stored or used?
  Is it reliable? Accurate?
  If I make a change to “Profit Amount” – what will be impacted?

Metadata Business Drivers What is happening in the Data Integration Market that requires Metadata?
Governance and Compliance Regulations are increasing
  How do organizations comply and meet documentation requirements?
  How can organizations ensure accountability and responsibility?
Business Competition continues to grow
  How do organizations individualize their customer experience?
  How can organizations get access to information to make correct decisions?
Costs and system complexities are expanding
  How can organizations drive optimization with integration?
  How do organizations manage complex software environments?
The data integration market is continuing to expand for multiple reasons. Each of these reasons is increasing the need to have better control over metadata to support corporate initiatives.
Governance and Compliance regulations such as the Basel II Accord and Sarbanes-Oxley require organizations to understand, track and audit where their data came from and how it has been transformed, so that they can have confidence in signing off on high-profile reports.
Business competition and pressures continue to grow. Organizations that have a handle on what information they have, what it means and how it is related to other organization assets have a better chance of creating the right marketing and business plans to beat their competition. If you don’t know who your customer is, how can you sell them more?
As these organizations continue to grow, so do the costs and underlying system complexities. Costs cannot continue to grow exponentially; companies must find ways of supporting more with less and re-using what they have. Understanding how processes interact is key to administering and troubleshooting issues that can occur in the middle of the night at 2 AM.

IBM Data Governance Maturity Model - measurable results
Business Transformation
--Do not go blind trying to read the tiny text here--
IBM and a number of our customers have engaged in a pioneering effort in the area of data governance (DG). We’ve formed the DG Council under the leadership of Steve Adler. The Council has thoroughly examined the whole playing field of DG and developed a maturity model that defines 1) the different aspects of DG, as reflected by these 11 categories, as well as the key indicators within each aspect, and 2) a set of maturity indicators for these categories. The maturity model is patterned on the Software Engineering Institute’s Capability Maturity Model.
The model gives you an assessment of where you currently are and provides a framework for where you want to be. The goals and structure of an organization drive which category to attack first and what maturity level to aim for. Not everyone attacks all levels at the same time, and measurement is at a category level, not across the board. Where you are, and where you want to be, is the choice of your organization.
A maturity model goes along with an Assessment Process, and GBS has an excellent practice around it.

A Brief History of Metadata – Where did we come from?
Timeline axis: 1970, 1980, 1990, 2000, 2002, 2007, 2008
  1970: First Metadata Repositories, e.g. IBM Data Dictionary
  1980: Single Enterprise Repository (IBM-led AD Cycle). Many vendors, unable to agree on a standard; 800-page spec; impractical solution, too many cooks.
  1990: Metadata Consolidation Solutions, e.g. Dovetail. Consolidation to a central repository using parsers, bridges and generators. Semi-automated; required administration plus changes to system development life cycle methodologies. One-third failed miserably; one-third succeeded only marginally (by using the repository for data architecture or a single key mission-critical application); one-third demonstrated a good ROI.
  2000: Community Focused Repositories. $50 million to $100 million, mainly maintenance; most organizations decided to live with metadata definitions in multiple technologies, and to coordinate and synchronize metadata across the tools in a more manual, multiprocess way.
  2002: Data Warehouse Repositories (ETL, DQ, DP), e.g. Ascential MetaStage. Less-ambitious, more-scoped metadata management solutions; leading tools supported bridging metadata between technologies such as data modelling/database design, data transformation (extraction, transformation and loading) and business intelligence tools.
  Federated Repositories and Registries, e.g. Unicorn. SOA’s metadata related to business services, runtime workflow and orchestration technologies, and Java and .NET components led to an increased interest in metadata management: the need to understand how software and data services relate to business process workflows, and the requirement to publish, reuse and govern these services.
  IBM buys Ascential
  IBM buys Unicorn
  Information Server Launched
Steve Born
This slide shows a history of metadata and how the solutions have evolved and transformed over the last 30 years. It started back in the early 1970s with products like IBM’s Data Dictionary for the mainframe. In the 1980s, the IBM AD Cycle project was an attempt by a number of groups to consolidate metadata into a single enterprise repository. Then in the 1990s, there was another attempt to build a consolidated repository with applied standards, unfortunately with very limited success. Just before the turn of the century, people moved to community focused repositories as the number of applications within an enterprise began to skyrocket. These were huge projects costing millions of dollars to maintain and manage.
In the early 21st century, in an attempt to make things more containable and focused, there were several attempts at designing data warehousing specific repositories, for example with products like MetaStage (from Ascential). Then IBM decided to acquire Ascential for a new metadata foundation they were building for their 8.x release. In the meantime, federated metadata repositories started to appear, driven by the increase in SOA projects and the need to promote these new enterprise services via registries. The company Unicorn had a federated repository, so IBM acquired Unicorn for their in-depth engineering expertise and partnered them with the Ascential acquisition to complete their new unified data integration metadata architecture.
Over this time, IBM has learned that metadata integration is significantly complex and difficult because it requires the organization to gather information spread across multiple processes, consolidate it into a single storage area and then reconnect the information with multiple user viewpoints to deliver additional value to downstream users and processes. While this is a difficult problem to solve, being able to instantly share information across not only multiple user roles and tasks but also multiple integration processes and projects allows companies to take full advantage of their investments and streamline the overall efficiency of their development process. The IBM InfoSphere Information Server platform was designed to facilitate this type of meaningful integration, allowing organizations to focus on solving complex business problems rather than integrating separate profiling, cleansing and extract, transform and load (ETL) data technologies.

Where IBM Fits in the Metadata Life Cycle
InfoSphere Information Server Metadata
IBM
Clean, controlled metadata supply
Enterprise Repositories
Think of metadata like the water life cycle. EMRs are the ocean. IDRs (Integration Data Repositories) are IBM, INFA, etc.

Why Businesses Care

Metadata Business Challenges
Metadata is naturally a very complex subject which virtually all organizations address at some point in time, some successfully and some unsuccessfully.
Key challenges:
Obtaining agreement on what metadata means
  What does metadata mean to a specific organization or division?
  What metadata is important to track and manage?
  How does each group use metadata for their particular job?
Selecting the correct metadata strategy for particular business requirements
  How many and what kinds of silos of metadata exist today in the organization?
  Are there revenue dollars at risk, compliance issues, or regulatory rules which must be addressed?
  Do we have the flexibility to assess the impact of changes with our current architecture?
Promoting adoption of a metadata strategy and associated technology
  How does our approach address our different users’ needs?
  Is our approach one that is easy to use and facilitates adoption rather than hinders it?
  Do we have business and technical sponsorship?

The IBM Solution: InfoSphere Information Server
IBM InfoSphere Information Server
Unified Deployment
  Understand: Discover, model, define, and govern information structure and content
  Cleanse: Standardize, merge, and correct information
  Transform: Combine and restructure information for new uses
  Deliver: Synchronize, virtualize and move information for in-line delivery
Unified Metadata Management
Parallel Processing
Rich Connectivity to Applications, Data, and Content
Delivering information you can trust
Key Point: The culmination of our efforts has led us to our latest platform offering – the IBM Information Server.
IBM Information Server is a revolutionary new software platform from IBM that helps organizations derive more value from the complex, heterogeneous information spread across their systems. It enables organizations to integrate disparate data and deliver trusted information wherever and whenever needed, in line and in context, to specific people, applications, and processes. IBM Information Server helps business and IT personnel to collaborate to understand the meaning, structure, and content of any type of information across any sources. It provides breakthrough productivity and performance for cleansing, transforming, and moving this information consistently and securely throughout the enterprise, so it can be accessed and used in new ways to drive innovation, increase operational efficiency, and lower risk.
IBM Information Server is designed to help companies leverage their information across all its sources. IBM Information Server delivers all of the functions required to integrate, enrich and deliver information you can trust for your key business initiatives. IBM Information Server allows you to:
  Understand all sources of information within the business, analyzing its usage, quality, and relationships, and define the rules and logic necessary to transform the data and use it in new business ways
  Cleanse it to assure its quality and consistency
  Transform it to provide enriched and tailored information
  Federate it to make it accessible to people, processes, and applications
All of these functions are based on a parallel processing infrastructure that provides leverage and automation across the platform. The Information Server also provides connectivity to nearly any data or content source, and the ability to deliver information through a variety of mechanisms. Underlying these functions is a unified metadata management foundation that provides seamless sharing of knowledge throughout a project lifecycle, along with a detailed understanding of what information means, where it came from, and how it is related to information in other systems. Integration logic built within IBM Information Server can easily be deployed and managed as a shared service within a SOA.
IBM Information Server provides:
  access to the broadest range of information sources
  the broadest range of integration functionality, including federation, ETL, in-line transformation, replication, and event publishing
  the most flexibility in how these functions are used, including support for service-oriented architectures, event-driven processing, scheduled batch processing, and even standard APIs like SQL and Java
The breadth and flexibility of the platform enable it to address many types of business problems and meet the requirements of many types of projects. This optimizes the opportunities for reuse, leading to faster project cycles, better information consistency, and stronger information governance.
Regarding Service-Oriented Architectures, information integration enables information to be made available as a service, publishing consistent, reusable services for information that make it easier for processes to get the information they need from across a heterogeneous landscape.

InfoSphere Information Server: Unified Metadata Management
Unified Metadata Management
Business | Technical | Operational
  Store: Active, centrally managed repository with secure access via services layer
  Enhance: Define relationships, control extensibility, and link 3rd party metadata
  Exchange: Share and deliver relevant information across the organization
Any User
Data integration metadata repository – core to InfoSphere Information Server
  Facilitates change management & reuse
  Simplifies integration
  Increases trust and confidence in information
  Enables data governance initiatives
The IBM InfoSphere Information Server platform supports three primary types of metadata: business, technical and operational. These metadata types can be linked together to create an end-to-end relationship between them, so users can understand not just where information came from and what manipulation has been applied to it but also the business context of that information. Information Server manages these 3 metadata types through a common architecture approach to STORE, ENHANCE and EXCHANGE metadata (SEE).
Store refers to the unified metadata foundation of the IBM InfoSphere Information Server platform, which provides a single active repository, accessed via a common services layer, to facilitate shared understanding across business and technical domains for metadata-driven integration.
Enhance refers to the way metadata stored within the common repository can be enhanced by creating linkages to third party metadata and by implementing controlled metadata extensions to capture additional business requirements.
Exchange refers to the IBM InfoSphere Information Server concept of task-driven modules. Relevant metadata is exposed and shared to different users within an organization via the appropriate interface.
The unified metadata management layer of Information Server was designed to support 5 goals:
  Core infrastructure to Information Server – all modules use the same active repository to share and exchange metadata effortlessly across user roles
  Promote improved management and re-use across integration projects
  Simplify the integration process by eliminating manual steps and maintenance of metadata
  Increase organization confidence by providing insight into where information came from and what it means
  Ensure organizations meet and exceed governance requirements including compliance and regulatory mandates

InfoSphere Information Server: Unified Metadata Management
Business | Technical | Operational
  Store: Active, centrally managed repository with secure access via services layer
  Enhance: Define relationships, control extensibility, and link 3rd party metadata
  Exchange: Share and deliver relevant information across the organization
Role-based Tools with Integrated Metadata
  Business Users
  Subject Matter Experts
  Architects
  Data Analysts
  Developers
  DBAs
Enhance Collaboration & Alignment
A delivery and exchange mechanism is critical to exposing metadata to the different user roles on integration projects. Information Server includes task-driven modules designed for exposing metadata to user roles in context. This is what enables the Information Server platform to actively share relevant metadata with each user role throughout the integration process. As such, little manual maintenance is required to manage the metadata assets.
For Business users, metadata is exposed via the Business Glossary and Business Glossary Anywhere modules.
For Developers and technical users, metadata is exposed via development modules of Information Server such as DataStage/QualityStage.
For administrators and project managers overseeing multiple tasks, metadata is exposed through the web console and Metadata Workbench.
Metadata is also exposed via a common web-based consolidated reporting layer that operates across the Information Server modules. Reports can be scheduled to run automatically, and report access can be controlled per user. PDF, HTML, RTF and text versions of reports can be created and stored for historical purposes.

InfoSphere Business Glossary

Business Glossary Overview
Applications Users Business Glossary Business Glossary Anywhere

Applications of a Business Glossary
Simply put, a Business Glossary is created to represent the language of the business, independent of technology
Three Primary Applications:
Ownership
  Identifying stewards
  Managing content
Collaboration
  Common, approved vocabulary
  Sharing domain expertise - Business & IT
Auditability
  Evolution of language
  Centralized management
All are key enablers of regulatory compliance and support the IBM Data Governance Maturity Model

Business Glossary: For Every Enterprise User
Examples of most common users include:
  General Business Users
  Business Analysts
  Data Analysts
  BI and ETL Developers
  Data Stewards
  Subject Matter Experts
  Business Glossary Authors
  Business Glossary Administrators

Understanding the value of Business Metadata
In the language of the business, independent of technology
Documents the business meaning of data & related technology assets
Used to:
  define a shared meaning
  standardize names
  establish responsibility, accountability, and traceability
  govern access
  share insights & experiences among users
  represent business hierarchies
  document business descriptions, examples, abbreviations and synonyms
Must be managed by those that understand the meaning and importance of the information assets to the business
Better aligns the efforts of IT with the goals of the business
The business hierarchy may be in a different representation than the way the information is structured in the technical world.
Meaning: business descriptions including examples.
Dependencies: relationships between business data, e.g. how to categorise a customer into a high net worth segment.
Ownership: making people responsible for changes and distribution of business information; required in any data governance policy.

InfoSphere Business Glossary
Business Users
Steward
Create and manage business vocabulary and relationships
Steward Console
Features
  Facilitate business & IT communications by creating & managing a common business vocabulary
  Web based interface shared across enterprise business teams
  Allows creation of stewards & assignment of their responsibilities for terms & assets
  Link business terms to information assets
Benefits
  Aligns the efforts of IT with the goals of the business
  Provides business context to information technology assets
  Establishes responsibility and accountability in accordance with data governance policies
Business Interface
This slide is organized into simple talking points designed to describe how the different Business Glossary interfaces each support a user role with specific business needs.
Product Name and one-sentence description of the edition, in blue: IBM InfoSphere Business Glossary is a web-based tool that enables analysts and subject-matter experts to create, manage, and share a common enterprise vocabulary and classification system. Because the tool is actively connected to the InfoSphere Metadata Server, users link business terms to technical artifacts which in turn are shared with other users. The result is a common set of semantic tags used by data modelers, data analysts, business analysts, and end users.
Target Users include:
Stewards (typically the Subject Matter Experts and Analysts within the organization): IBM InfoSphere Business Glossary is targeted to those users or stewards responsible for administering and maintaining the Business Glossary for an organization. These users can take ownership of business metadata by importing terms from external sources, authoring and editing terms in the Web interface, classifying terms into categories, and relating terms to more technical artifacts managed by the InfoSphere Metadata Server. These users can also customize the entry page to the application and extend the underlying meta-model with custom attributes on both business categories and business terms.
Business User: IBM InfoSphere Business Glossary Browser provides business users with read-only access to the contents of the business glossary. All too often, tools intended to aid users in day-to-day activities fail to be adopted because of usability barriers caused by over-engineering. With this in mind, the InfoSphere Business Glossary Browser was designed based on two key principles: “simplicity lasts” and “cut right to the chase.” Business users can perform a multitude of tasks including browsing the glossary graphically; searching and viewing details for terms, categories, and stewards; viewing classified objects; and sending feedback directly to the data stewards as business terms evolve through their life cycle.
Features: Key functionality points to emphasize with the customer when talking about capabilities of Business Glossary.
Benefits: Business value of using Business Glossary. The role of metadata is a fundamental underpinning of data governance. This is reflected in the Data Governance Blueprint created jointly by customers and the IBM Data Governance Council. This blueprint defines a maturity model, measuring data governance competency and providing assurance based on 11 disciplines. One of the paramount disciplines is Metadata/Business Glossary. The IBM InfoSphere Business Glossary is a web-based portal created in support of data governance initiatives for the definition, management, search and exploration of business vocabulary and its rules and relationships to company data assets. To ensure data quality, only authorized data stewards can use the administrative functions within IBM InfoSphere Business Glossary to create and manage the glossary. By leveraging a common underlying repository, Metadata Server, the benefit of collaboration is enhanced through the sharing of the business terms by IBM Rational Data Architect, IBM Industry Models, InfoSphere Information Analyzer and Information Server FastTrack, creating a common set of semantic tags for reuse by data modelers, data analysts, business analysts, and end users.
Product Screen Shots: The Business Glossary stewardship console is web-based and allows only authorized data stewards to create and manage the glossary.

Populate: Import IBM Industry Models
Financial Services
Insurance
Healthcare
Available now!
Telco
Industrial
Retail
Jump-start your Business Glossary with thousands of industry standard terms from IBM’s Industry Models. Refined by real customer usage, they leverage years of experience and expertise to provide the best foundation for your glossary.

Populate: Import InfoSphere MDM for PIM
Share your product category information structures from Master Data Management with Business Glossary enabling access by all enterprise users.

Populate: Create and Assign Stewards
Create and Assign Stewards to Terms, Categories or Assets
  Who is responsible for this Term?
  Who is responsible for this Category?
  Who is responsible for this Asset?
View Contact Information for Steward

Access: Simple to use – business browser
“I didn’t need any special training to use the business glossary. It’s laid out exactly as I expected.”
“I can set this up so that my people immediately see the most used business hierarchies.”

InfoSphere Business Glossary Anywhere
User
Real-time access to business glossary from any desktop application
Features
  Click on a term and view definition in a pop-up window without loss of context or focus
  Intelligent matching returns best candidates in a single search
  Search engine for terms and categories
  Access steward contact information directly
  Security enforced via the Information Server common security layer
From Any Application...
Benefits
  Increased trust and acceptance of information by delivering definitions in context
  Expanded adoption of enterprise glossary outside of Information Platform technologies
  Improved information availability with multiple access mechanisms for electronically stored information (ESI)
Pop the Definition!

Business Glossary Anywhere with Cognos
1. Highlight Term Active and click to look up in Business Glossary
2. Candidate Terms returned.
3. Select Term to view full details.
This is a Cognos report in a web browser. These screen shots show the user selecting the word “Active” on the report title and pressing shift-right click to launch a search of the Business Glossary. A pop-up screen (Business Glossary Anywhere) is displayed in the context of the same report with all candidate Terms listed. The word ACTIVE is selected to see the long definition and other potential information including:
  The steward
  The status of the Term (accepted, candidate, deprecated, etc.)
  The long description of the Term
  An example of its use
  A Custom attribute (if created) to extend the glossary (applicable to Terms and categories only)
  Related IT Resources, which show which objects define or classify this business Term. For example, forecast information is maintained in a particular table, in a particular database on a particular data server.
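The three steps above (highlight a word, get candidate terms back, select one for details) can be sketched in a few lines. This is a minimal illustration, not the product's matching logic: the GLOSSARY content, the stewards, the candidate_terms function and the ranking rule (exact match, then terms containing the text, then near spellings via Python's standard difflib) are all assumptions. Only the word “Active” and the status values come from the slide.

```python
import difflib
from typing import Dict, List

# Hypothetical glossary content; only the term "Active" appears on the slide.
GLOSSARY: Dict[str, Dict[str, str]] = {
    "Active": {"status": "accepted", "steward": "Steward A"},
    "Active Customer": {"status": "candidate", "steward": "Steward B"},
    "Inactive Account": {"status": "deprecated", "steward": "Steward A"},
    "Forecast": {"status": "accepted", "steward": "Steward C"},
}


def candidate_terms(highlighted: str, limit: int = 5) -> List[str]:
    """Return glossary terms that plausibly match the highlighted text."""
    needle = highlighted.strip().lower()
    names = list(GLOSSARY)

    # Rank: exact match, then terms containing the text, then near spellings.
    exact = [n for n in names if n.lower() == needle]
    containing = [n for n in names if needle in n.lower()]
    by_lower = {n.lower(): n for n in names}
    close = difflib.get_close_matches(needle, list(by_lower), n=limit, cutoff=0.8)
    fuzzy = [by_lower[m] for m in close]

    ordered: List[str] = []
    for name in exact + containing + fuzzy:
        if name not in ordered:      # keep the first (best-ranked) occurrence
            ordered.append(name)
    return ordered[:limit]


def term_details(name: str) -> Dict[str, str]:
    """Step 3: return the stored details for the selected term."""
    return GLOSSARY[name]
```

With this data, candidate_terms("ACTIVE") returns the exact term first, followed by the other terms that contain the word, and a misspelling such as "Forcast" still finds "Forecast". A real deployment would query the shared repository through the product's own interfaces rather than an in-memory dictionary.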

Manage: Business and IT understanding
Achieve a common vocabulary between business & technical users! …an example.
InfoSphere Business Glossary
  Category: Costs
  Term: Tax Expense
  Full Name: Tax to be paid on Gross Income
  “The expense due to taxes …..” (John Walsh is responsible for updates. 90% reliable source)
  Status: CURRENT
InfoSphere DataStage
  Database = DB2
  Schema = NAACCT
  Table = DLYTRANS
  Column = TAXVL
  data type = Decimal (14,2)
  Derivation: SUM(TRNTXAMT)
Shared Information Server metadata repository
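One way to picture the link in this example is as two small records, one business and one technical, joined in a shared store. The sketch below is illustrative only: the Column and Term classes, their field names and the terms_for_column helper are assumptions and do not reflect the actual Information Server repository model. The values are the ones shown on the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Column:
    """Technical metadata for one physical column (hypothetical model)."""
    database: str
    schema: str
    table: str
    name: str
    data_type: str
    derivation: str

    def qualified_name(self) -> str:
        return f"{self.database}.{self.schema}.{self.table}.{self.name}"


@dataclass
class Term:
    """Business metadata for one glossary term (hypothetical model)."""
    category: str
    name: str
    full_name: str
    steward: str
    status: str
    assets: List[Column] = field(default_factory=list)


# Values taken from the slide's example.
tax_column = Column("DB2", "NAACCT", "DLYTRANS", "TAXVL",
                    "Decimal(14,2)", "SUM(TRNTXAMT)")
tax_expense = Term(
    category="Costs",
    name="Tax Expense",
    full_name="Tax to be paid on Gross Income",
    steward="John Walsh",
    status="CURRENT",
    assets=[tax_column],
)


def terms_for_column(terms: List[Term], qualified_name: str) -> List[str]:
    """Technical-to-business lookup: which terms describe this column?"""
    return [t.name for t in terms
            if any(c.qualified_name() == qualified_name for c in t.assets)]
```

A business user starts from tax_expense and follows assets to the physical column; a developer starts from "DB2.NAACCT.DLYTRANS.TAXVL" and calls terms_for_column to find the business meaning and the steward to contact.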

Business Glossary Business Benefits
Enables data governance
  Common language supports compliance regulations such as Basel II
  Represent and expose business relationships
Accountability and responsibility
  Assign stewards as single point of contact for control
Improved productivity
  Administrators can tailor the tool to the needs of their business users
  Access information you need when you need it
Increased collaboration
  Capture and share annotations between team members
  Greater understanding of the context of information
Greater trust in information
  Immediate, in-context access to terms and definitions improves decision making

InfoSphere Metadata Workbench

Metadata Workbench Overview
Applications Users Metadata Workbench

Applications of Metadata Workbench
Simply put, Metadata Workbench allows you to understand where information came from and where it is used. Three primary applications:
Trust: Data lineage for traceability. Understand relationships.
Productivity: Assess impact of change. Maximize re-use.
Governance: Visibility and reporting. Centralized management. Data lineage reporting supports regulatory compliance and auditability requirements. Complete set of reporting, management and administration functions for centralized control of assets.
All are key enablers of regulatory compliance and the IBM Data Governance Maturity Model.

Metadata Workbench: For Project Leads & Architects
Examples of the most common users include: Project Leaders, Data Architects and Modelers, DataStage Developers, BI and OLAP Developers, Business Subject Matter Experts, Business Users and Compliance Officers.
Project Leaders: Assists with project planning and reviews with end users.
Data Architects and Modelers: Captures source system metadata from Profiling. Share source and target data models.
DataStage Developers: Reuse Source and Target table definitions. Captures operational information. Assists with impact assessment and data lineage.
BI and OLAP Developers: Provides target table, data lineage and business definitions.
Business Subject Matter Experts: Captures business entities and relates them to physical data.
Business Users and Compliance Officers: Provides knowledge of where data came from and what it means.

InfoSphere Metadata Workbench
IT Developers, Administrators, Project Managers & DBAs
Exploration of assets generated and used by Information Server
Features: Cross-suite view and administration of Information Server assets. Explore, analyze and manage metadata graphically and textually. Comprehensive reporting on data lineage. Assess dependencies across Information Server & 3rd party tools. Perform searches and build customized queries.
This slide is organized into simple talking points designed to describe how Metadata Workbench supports user roles with specific needs.
Product name and one-sentence description of the edition (in blue): IBM Metadata Workbench provides visual Web-based exploration of information assets generated and used by the IBM Information Server. It improves business trust in information and increases IT responsiveness by tracing the data lineage path. It visually depicts the relationships between modeling tools, IBM Information Server modules and business intelligence applications.
Target User: IT Developers, System Administrators, Project Managers and Database Administrators. IBM Metadata Workbench gives Developers, Administrators and Project Managers insight into what happens to the data as it moves from modeling tools through the IBM Information Server to business intelligence applications, so they can better manage data integration environments.
Features: Key functionality points to emphasize with the customer when talking about the capabilities of Metadata Workbench.
Benefits: Business value of using Metadata Workbench:
- Increase trust: Complete data lineage traceability to understand where data came from, what happened to it along the way and what it is related to. Define linkages between business terms and IT assets to understand relationships.
- Improve productivity: Automated metadata reconciliation across design and operational metadata assets. Visual cross-tool impact analysis to assess changes before they occur. Simple and advanced search and ad-hoc query to manage IBM Information Server assets.
- Strengthen governance programs: Data lineage reporting supports regulatory compliance and auditability requirements. Complete set of reporting, management and administration functions for centralized control of assets.
Product Screen Shots: Metadata Workbench exposes the linkages between metadata assets and how they are related to one another. This screen shot shows the metamodel of information which is captured and stored within Information Server for the data modeling tool ERwin. This information can then be connected to other Information Server metadata such as business terms and ETL jobs to provide deeper context and meaning.
Benefits: Assess and mitigate change management risk. Support compliance and governance initiatives requiring data lineage (e.g., Sarbanes-Oxley, Basel II). Proactively manage and administer complex data integration environments.

Information Server: Optimizing Application Development
Diagram labels: Import Industry Data Models; Exchange Data Structures; Services Oriented Architecture; Rational Data Architect; Information Services Director; Link; Populate; Deploy; Common Enterprise Vocabulary; Transform and Cleanse; Search and Profile Source Data; Map Sources to Target Model; Link; Define; Create.
Integration projects have many user roles involved. These user roles work independently, each performing a specific task in a specific interface. Interaction and collaboration between these users and tasks, however, is extremely important to maximize team productivity and share knowledge. The IBM Information Server approach enables this goal through the industry-leading architecture of a unified integration metadata repository. IBM Information Server was designed to be fully flexible to integrate with existing organizational data integration processes. The following diagram illustrates just one possibility of deployment options for the platform and how an organization can leverage the architecture to maximize application development activities via the unified metadata repository. Beginning with the IBM Industry Models, customers can take advantage of IBM’s years of experience with developing and refining industry-specific vertical models. These models are available in Rational Data Architect and include a glossary, logical and physical model. The glossary model contains thousands of industry standard terms which can be used to pre-populate the WebSphere Business Glossary contents. The physical data model information can be imported from Rational Data Architect and shared with all the modules of Information Server to design the integration processes to populate it. Customers have the option to modify and extend the Industry Models to customize them for their particular business requirements. Once the data models are defined and the business context is created, analysts need to profile and understand the source systems which will be used to populate the new target data model.
During the profiling process, analysts can also create and define new business terms to describe the data sources as necessary if these business definitions were not previously defined by the Industry Models. Once the models, business context and profiling steps have been completed, the analyst is now ready to create the mapping specifications which are the input into the building of the ETL jobs for the new application. Utilizing the business context and profiling results, the analyst defines the specific transformation rules necessary to convert the data sources into the correct format for the Industry Data model target. During this process, the analyst not only defines these business transformation rules but also can define the relationship between the business terms and their representation in physical structures. These relationships can then be published to any user for consumption and better understanding of the asset relationships. The creation of the business specification serves as not only historical documentation but also as direct input into the generation of DataStage ETL jobs. The business rules defined are included in the ETL job directly as either code or as annotated to-do tasks for the developer to complete and make the job ready for deployment into production. Once the ETL job is ready, the developer can also decide to deploy the same batch process as a service-oriented architecture component using WebSphere Information Services Director. Throughout this whole process, metadata is generated and maintained as a natural consequence of using the Information Server modules. This is what enables the Information Server platform to share relevant metadata to each of the user specific roles throughout the entire integration process. Because of this unique architecture, there is little manual maintenance required to manage the metadata assets. 
Only third party metadata requires administration tasks such as defining the relationships to the Information Server metadata objects. This is unlike alternative solutions which use passive metadata and thus require extensive maintenance to manage the metadata and keep the information up-to-date. Administrators and developers who need to have a cross-view of the Information Server and third party metadata assets utilize Metadata Workbench to query, analyze and report on this information from the common repository.
Diagram labels: DataStage and QualityStage; Business Glossary; Information Analyzer; FastTrack (each labeled Share). Unified metadata for active administration, management and reporting. Metadata Workbench offers complete visibility and control of metadata.

Metadata Workbench Integration Asset Categories
Business (Business Glossary); Profiling (Information Analyzer); ETL and Quality Design (DataStage/QualityStage); SOA (Information Services Director); Users, Security, Reporting (Web Console interface); Asset Relationships (Metadata Workbench); Specifications (FastTrack); ETL runtime stats (DataStage/QualityStage); BI and OLAP design (3rd party BI vendors); Physical schemas (3rd party CASE, modeling, RDBMS vendors)

Metadata Workbench Feature Overview
ADMINISTER: Manage Integration Assets to enable in-depth analysis. Assign security roles. Link together multiple viewpoints of design assets from ETL, business, BI and modeling with operational metadata. Edit names and descriptions of Integration Assets. Access runtime information to enrich reporting. Import export manager for 3rd party integration.
EXPLORE: Explore key Integration Assets: Jobs, Reports, Databases, Models, Terms, Stewards, Systems, Specifications, Quality Rules. Easy navigation of key Integration Assets. Simple and advanced search. Integrated cross-view of Information Server and 3rd party assets. Graphical view of Asset Relationships.
ANALYZE: Trace data movement to and from databases, jobs and reports for full lineage. Understand business meaning of columns, tables, and other assets. Assess the impact of change across Integration Assets. Robust query builder. Analyze dependencies and relationships between key Integration Assets, Business Intelligence Reports and data models.

Explore: Homepage
The homepage offers direct access to discover key Information Assets, find any Information Asset, or execute Queries.

Explore: Graphically Browse
Browse Databases, Files, DataStage/QualityStage Jobs, Table Definitions or other objects within their Folder and Project placement. Flexible browsing capabilities enable users to find information more easily and better manage complex environments.

Explore: Textually Search Where and how do I find out details about assets?
Full search and explore capabilities. View in textual list or graphical format. Search multiple ways, including by asset class, name or asset property.

Analyze: Data Lineage Where does a Field of Data in this Report Come From?
Import & browse full BI report metadata. Navigate through report attributes. Visually navigate through data lineage across tools. Combines operational & design viewpoints. Diagram labels: IBM Information Server; Source Tables.

Analyze: Cross-Tool Impact Analysis What Happens if I Change this Column?
Show complete change impact in graphical or list form. Includes impact on reports in BI tools. Visually navigate through impacted objects across tools. Allows impact analysis on any object type.
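Data lineage and impact analysis can both be pictured as walks over a graph of data flows: lineage follows the flows backwards from a report to its sources, and impact analysis follows them forwards from a changed asset to everything downstream. The sketch below is a generic breadth-first traversal written for illustration; it is not how Metadata Workbench is implemented, and every asset name and function name in it is hypothetical.

```python
from collections import deque

# Hypothetical flows: (source asset, target asset) means data moves source -> target.
FLOWS = [
    ("table:SRC.ORDERS", "job:LoadOrders"),
    ("job:LoadOrders", "table:DW.ORDERS"),
    ("table:SRC.CUSTOMERS", "job:LoadCustomers"),
    ("job:LoadCustomers", "table:DW.CUSTOMERS"),
    ("table:DW.ORDERS", "report:SalesByRegion"),
    ("table:DW.CUSTOMERS", "report:SalesByRegion"),
    ("table:DW.CUSTOMERS", "report:CustomerList"),
]


def _adjacency(edges, reverse):
    """Build an adjacency map, flipping edge direction when reverse is True."""
    adj = {}
    for src, dst in edges:
        a, b = (dst, src) if reverse else (src, dst)
        adj.setdefault(a, []).append(b)
    return adj


def reachable(start, edges, reverse=False):
    """Breadth-first search: every asset reachable from start, excluding start."""
    adj = _adjacency(edges, reverse)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


def lineage(asset, edges=FLOWS):
    """Where does this asset's data come from? (walk flows backwards)"""
    return reachable(asset, edges, reverse=True)


def impact(asset, edges=FLOWS):
    """What is affected if this asset changes? (walk flows forwards)"""
    return reachable(asset, edges, reverse=False)


print(sorted(lineage("report:CustomerList")))
print(sorted(impact("table:SRC.ORDERS")))
```

The same edge list answers both questions, which is why a single shared repository of design and operational metadata can serve lineage reporting and change impact assessment alike.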

Analyze: Business Meaning What does this Field Mean?
Show relationships between business terms, data model entities, and technical and report fields. Provides cross-tool mapping of business terms. Allows field meaning to be understood. Allows business term relationships to be understood.

Metadata Workbench Business Benefit
Good exploitation and coordination of metadata across tools is very compelling. Tool integration and metadata collaboration save time and money, and improve the quality of results. An organization’s ability to Govern Data is significantly improved. An organization is better able to Manage Change and is more Agile. A shared, common vocabulary saves time and effort, helps development, and makes data more accessible and more understandable. The key is making the metadata capture and share process effortless.

Third Party Metadata

Import Export Manager for Information Server
IT Developers, Administrators
Import Export Manager for Information Server: Expand visibility of metadata touch-points in support of data integration projects.
Features: Security enforced via the Information Server common security layer as well as the 3rd party application security layer. Metadata Bridges interchange metadata with each specific application and consist of a model, a decoder, and an encoder, which require no coding. Import capabilities for 3rd party BI tools (Cognos, Business Objects, MicroStrategy), data modeling tools (ERwin, RDA) and databases (ODBC connections to all major RDBMS). Support for a variety of metadata exchange formats, including XMI, XML, UML, CWM and CSV.
The import export manager is used to import metadata from 3rd party products, such as modeling and business intelligence products, to expand on the critical touchpoints between these applications and the data integration process. Primary sources are external to IBM; however, other IBM technologies such as RDA are also supported.
Benefits: No manual interface coding required for 3rd party metadata visibility. Visibility from data modeling to ETL to report layer minimizes the risk of overlooking critical dependencies. Leverage a common metadata exchange environment for application development consistency.

Meta Integration Technologies, Inc. (MITI)
OEM of 3rd party metadata bridges for import. More than a dozen major vendors OEM MITI bridges. IBM and MITI jointly certified and tested bridges. Additional (MITI) bridges: many bridges are available “as-is” and can be easily enabled post installation.
OEM Partners: Adaptive, Cognos, IBM, Metamatrix, Sypherlink, Business Objects (SAP), Composite, Informatica, SAS, CA, Embarcadero, Kalido, SchemaLogic.
Other partners: ASG, Hummingbird (Opentext), Telelogic (an IBM company), Cosort, Oracle, Microsoft, Sybase.

Fully Supported Bridges
As of 8.1 release IBM and MITI both perform QA testing independently

Bridges Provided As-Is
As of 8.1 release BI Reports, OLAP and Physical Schemas MITI performs QA testing independently

Bridges Provided As-Is (continued)
As of 8.1 release MITI performs QA testing independently

The Areas of Metadata
Business Glossary & IS Users; ETL Operational Metadata (Job Run Information); BI Reports; Physical Schemas; ETL Job Design

The Areas of Metadata Connected
Business Glossary & IS Users ETL Operational Metadata (Job Run Information) BI Reports Physical Schemas ETL Job Design

Case Studies

APPROVED FOR EXTERNAL USE
Melbourne Health (MH): Advances clinical research with next generation genetic information infrastructure
Challenge: Melbourne Health’s legacy I/T infrastructure had limited ability to access and share medical data. Their research data, diagnostic results, and clinical documentation were stored in mixed formats across mixed environments: MS SQL Server, MS Excel and MS Access. They managed disparate information silos without being able to gain insight and meaning from them, which is key to unlocking the advancement of medical research and improving treatments.
Benefits: The BioGrid Australia virtual platform currently links research databases across 16 organizations, a world first and a milestone achievement for the research community. BioGrid provides access to a comprehensive set of research data, which ultimately leads to improved research accuracy and treatment quality, by: linking data across disparate clinical and biomedical data sources within and across institutions; linking patient/subject records by assigning data Unique Subject Identifiers (USIs); providing a uniform interface and access mechanism. BioGrid is a building block of a larger national grid initiative to share research data for the priority disease group of cancer.
Solution: Melbourne Health is deploying their BioGrid Australia project in 3 phases from in collaboration with hospitals and research institutes. The project utilizes the following key software: IBM DB2; IBM InfoSphere Information Server (InfoSphere Business Glossary, IBM InfoSphere DataStage, and IBM InfoSphere Federation Server); SAS Enterprise Business Intelligence Server, SAS Web Report Studio, and SAS Enterprise Guide.
Customer Profile: Customer: Melbourne Health. Industry: Healthcare and Research. Location: Australia. No. Employees: Employs more than 7000 staff across its services and manages more than 1000 beds in the acute, sub-acute and community sectors. Main Competitors: None (Free Public Service). Annual Revenue: AUD $651 million. Core Business: Clinical research institution. URL:
Company Background: Melbourne Health is a major public healthcare provider and internationally-recognized research hospital based in Victoria, Australia. Melbourne Health is using information integration technology to implement BioGrid Australia, an information infrastructure that will link medical data across institutions and help advance cancer research through collaboration.
Challenges:
- Melbourne Health’s legacy information technology infrastructure limited its ability to access and share medical data.
- They managed disparate information silos without being able to gain insight and meaning from them, which is key to unlocking the advancement of medical research and improving treatments.
- Their research data, diagnostic results, and clinical documentation were stored in mixed formats across mixed environments: MS SQL Server, MS Excel and MS Access.
The goal of their research project is to help improve disease management at the individual level by improving treatment strategies and reducing morbidity rates. Technical considerations include increasing the data sample sizes, expanding research collaboration capabilities, and linking specialist databases to improve the research process.
Solution: The BioGrid platform provides clinical researchers access to data from disparate existing databases across multiple disease types at multiple institutions. The data is co-located in a single repository and can be linked with publicly available research and genetic profiling data. Institute-specific data is loaded into institute-specific databases nightly using DataStage for ETL. Federation Server is then used to query the federated repositories. Business Glossary is used to publish the available data content in the repository. SAS is used as the front-end to initiate statistical queries for analysis.
Benefits:
- The BioGrid virtual platform currently links research databases across 16 organizations, a world first and a milestone achievement for the research community.
- BioGrid provides access to a comprehensive set of research data, which ultimately leads to improved research accuracy and treatment quality, by: 1) linking data across disparate clinical and biomedical data sources within and across institutions; 2) linking patient/subject records by assigning data Unique Subject Identifiers (USIs); 3) providing a uniform interface and access mechanism.
- BioGrid is a building block of a larger national grid initiative to share research data for the priority disease group of cancer.
Products: A wide range of research products are utilized in the implementation of the On-Demand Clinical Research project for BioGrid.
IBM Products: IBM DB2; IBM DB2 Data Warehouse; IBM InfoSphere Business Glossary; IBM InfoSphere Business Glossary Browser; IBM InfoSphere DataStage; IBM InfoSphere Federation Server.
Other Products: ASP; Microsoft Access; Microsoft Excel; Microsoft Internet Explorer; Microsoft .NET; Microsoft SQL Server and SSIS; Oracle; SAS Enterprise Business Intelligence Server; SAS Enterprise Guide; SAS Web Report Studio; Sun Java.
For More Information: IBM Case Study: None Available. IBM Press Release: None Available. IBM Reference Database: None Available.

Melbourne Health Screenshot: BioGrid Australia
Customer in Production: Melbourne Health is using Business Glossary and Business Glossary browser in a project called BioGrid Australia. The BioGrid platform provides clinical researchers access to data from disparate existing databases across multiple disease types at multiple institutions, co-located in a virtual repository, which can be linked with publicly available research and genetic profiling data. It is ethically approved at all participating sites. The Business Glossary describes the contents of each database currently integrated in BioGrid. The data elements are stored in this Glossary as Terms, which you can search for by keyword. Also, by selecting Browse you can drill down through the database hierarchy (stored as Categories) to get to individual Terms. Each Term is non-redundant and is linked to each occurrence of the Term in all databases. For example, the Term "Date of Birth" appears in the Glossary as a single Term but is linked to each occurrence of it in all databases.
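The idea of a non-redundant Term that is linked to every occurrence across databases, and that can be found by keyword, can be sketched in a few lines. This is an illustration only: the database, table and column names below are invented for the example and do not describe the actual BioGrid schemas or the Business Glossary API.

```python
from collections import defaultdict

# Hypothetical column occurrences: (database, table, column, business Term).
OCCURRENCES = [
    ("hospital_a_db", "PATIENT", "DOB", "Date of Birth"),
    ("hospital_b_db", "SUBJECTS", "BIRTH_DT", "Date of Birth"),
    ("institute_c_db", "DONOR", "DATE_OF_BIRTH", "Date of Birth"),
    ("hospital_a_db", "PATIENT", "DX_DATE", "Date of Diagnosis"),
]


def build_glossary(occurrences):
    """Group occurrences so each Term appears once, with links to every column."""
    glossary = defaultdict(list)
    for database, table, column, term in occurrences:
        glossary[term].append(f"{database}.{table}.{column}")
    return dict(glossary)


def search(glossary, keyword):
    """Case-insensitive keyword search over Term names."""
    needle = keyword.lower()
    return sorted(term for term in glossary if needle in term.lower())


GLOSSARY = build_glossary(OCCURRENCES)

for term in search(GLOSSARY, "birth"):
    print(term, "->", GLOSSARY[term])
```

Here "Date of Birth" is stored once even though three differently named columns carry it, which mirrors the single-Term, many-occurrences arrangement described above.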

Melbourne Health Client Quotes
“At this stage, our researchers are already able to carry out previously impossible studies to stratify patients by genetic types. Researchers are already seeing the initial fruits of collaboration, which will stimulate further cooperation as BioGrid is expanded. Ultimately, BioGrid equips us with capabilities to accelerate clinical research through collaboration.” – Robert Merriel, Director of Business Development, Melbourne Health
“IBM InfoSphere Information Server is a key toolset for our multi-phase, multi-year Molecular Medicine Informatics Model project. We are leveraging the capabilities of InfoSphere Business Glossary, a module of IBM InfoSphere Information Server, to share information about the business and technical metadata stored in our research repositories so that users can have the relevant information at their fingertips.”

Major Insurance Company in the UK – Metadata Workbench
Improved Trust of BI Reporting Through Lineage and Impact Analysis
Business challenge: Business intelligence reports are used to make some of this company's important business decisions. Understanding how these reports are created, to ensure their validity, is a must-have requirement. Today, when errors or questions about reports come up, DataStage developers complete a manual, time-consuming, and error-prone discovery process to track the lineage of the BI report field in question.
Solution: As a long-time DataStage user, this customer saw immediate value in leveraging the Metadata Workbench to better understand lineage and impact analysis across thousands of DataStage jobs. The capabilities of the Metadata Workbench could be used in place of the tedious and error-prone documentation currently used to troubleshoot errors or questions about business intelligence reports. In addition, the creators of BI reports saw value in accessing the Metadata Workbench on their own to research reports in question.
Benefits: Improved TRUST in BI reports. Time saved on manual troubleshooting of errors in BI reports. Automated the documentation process.
“The Metadata Workbench helps this customer prove through lineage and impact analysis that their BI reports contain Trusted Information.” – Desiree Ruiz-Amirgholi, IBM Lab Advocate and Product Manager for the Metadata Workbench

Summary

Information Server Metadata Differentiators
Auditing and traceability: single data integration repository
Enhanced collaboration: seamlessly share information across user roles
Simplified development: reduce manual steps with integrated platform development
Streamlined information access: promote enterprise adoption through ease of use
Strategic integration with IBM brands: roadmap for success
