1
Make the POWER i server a Data Warehouse - sharing DB2 data with other technologies
Mariusz Gieparda Manager, Sales Engineering - EMEA.
2
Remember those days… when the AS/400 was the Data Warehouse?
3
Today’s Businesses Have Multiple Databases
83% / 10% / 8% (Source: Vision Solutions 2017 State of Resilience Report). Multiple databases are the norm: merger or acquisition; choice of multiple apps or databases for best-of-breed solutions; combination of legacy and new databases; multi-organization supply chain.

As you are surely aware, it is the norm today for organizations to have two or more databases involved in running their business. There are a number of clear reasons for multi-database infrastructures. Merger and acquisition events are a common cause of multiple databases in an organization. Business goals drive mergers; IT teams are then tasked with getting the underlying systems of the two formerly separate companies to communicate, consolidate, or integrate. In other instances, multiple databases or applications with different underlying databases are chosen by design. For example, a best-of-breed accounting application may be selected separately from a best-of-breed operations management system, and those applications may run on completely different databases. At the time the packages were selected, the need for those two systems to eventually share data may not have been envisioned. But things change over time, and those databases must now share information seamlessly. Similarly, companies choose an application or database at some point in their history, and then determine it's time for a technology change or refresh. The old and new systems may coexist for a time, requiring data sharing, until such time as a consolidation or migration can occur. And finally, we see companies interacting with multiple databases as part of a multi-organization supply chain. Your organization may run your inventory management system on DB2, but your suppliers, and those you supply, may require the data in a completely different format, for a different operating system or database platform.
And so, businesses find themselves with a mix of databases that run on different hardware platforms, operating systems, and database systems. Regardless of how businesses end up with multiple databases, and what those databases are, IT still needs to make the business operate as efficiently as possible using all the systems in your environment. What does your database infrastructure look like today?
4
Traditional Methods for Sharing Data
Direct network access: reporting on production servers across the network during business hours. Issue: negatively impacts network and database performance, resulting in user complaints!
Off-hours reports and extractions: run reports off-hours or perform nightly ETL processes to move data to a reporting server. Issue: the business operates on aging data until the next extraction. Issue: difficult to find an acceptable time to perform an extraction.
ETL (Extract-Transform-Load) processes: FTP/SCP/file transfer processes, manual scripts, backup/restore, or in-house tools. Issue: periodic, not real-time, delivery of data. Issue: labor-intensive to create processes and tools. Issue: expensive to develop and maintain. Issue: prone to errors.

Now shifting gears to the pitfalls of traditional data sharing methods. As we talk to prospects and survey the market at Vision Solutions, we see three levels of approaches to data sharing. The first is sort of a "non-sharing" method: companies start by attempting to allow access to databases across the network for access or reporting at any point during the day. This often results in performance degradation for the network and database, and ultimately in user complaints. That leads business systems teams to the next level, which is running reports and extractions off-hours to ease the network and database load during business hours. However, this approach suffers from a couple of obvious issues. The first is finding a time for the extraction that falls outside business hours in today's 24x7 global business world. The second, even more serious, issue is attempting to run a business on data or reports that became obsolete as soon as they were pulled. Organizations will often then create manual processes, scripts, and in-house tools to either support their off-hours reporting or augment it. We'll talk more about this in a minute, but this approach clearly suffers from a high price tag associated with the labor to create, maintain, and troubleshoot those tools and processes.
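The in-house scripting approach above can be sketched in a few lines. This is a hypothetical toy, not any real customer's job: the table name, column names, and the full re-extract strategy are all invented to illustrate why a scheduled bulk ETL leaves the reporting copy stale between runs.

```python
# Minimal sketch of an in-house nightly ETL job (hypothetical names).
# The whole source table is re-extracted on a schedule, so the
# reporting copy is only as fresh as the last run -- the staleness
# issue described above.

def nightly_etl(source_rows):
    """Extract everything, apply a trivial transformation, and return
    rows staged for a bulk load into the reporting database."""
    staged = []
    for row in source_rows:
        staged.append({
            "customer_id": row["CUNUM"],
            "name": row["CUNAM"].strip(),   # trim fixed-width padding
        })
    return staged

# Pretend this came from a full-table extract of the production DB2 table.
full_extract = [{"CUNUM": 1, "CUNAM": "Acme Corp   "}]
print(nightly_etl(full_extract))
```

Every run repeats the full extract and load, which is exactly the labor and bandwidth cost that change-based replication avoids.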
5
In-House ETL Scripts and Processes Are Not Free
Upfront development costs Development of code to perform database extraction, transformation, and load Additional requirements for additional pairings, schemas, etc. Test system expenses Hardware and storage resources Database licenses for test systems Add-on products, e.g. gateways Maintenance costs Ongoing enhancements for altered schemas, additional platforms Testing new database and OS releases Cross training and documentation to reduce turnover risk Lost opportunity costs for other initiatives Let’s talk just a bit more about the cost of developing an in-house ETL style solution. The most obvious cost is in development resource costs. Distinct development work is required for each source database extraction, all transformations, and for each target load. The more databases, schemas, transformations and targets… the bigger the project and the higher the cost of the development. Add requirements for more pairings, more data types… and the price just goes up. The secondary cost of any development project is the expense of testing. I think we all know that this particular cost is sometimes shortcut, often to the detriment of the ultimate result, but well-managed projects will include a solid test effort. For a data sharing project that will entail additional hardware resources to host the test databases, additional database licenses for all sources and targets, additional storage for these systems and databases, and any add-on product (such as gateways) you leverage in your design. And once the initial development and test effort is underway, you must manage the maintenance costs. Ongoing enhancement requests for additional databases, additional data types, altered schemas, additional transformations, and functionality add up to an ongoing stream of development and test expenses. And, if there are issues… more effort to isolate, fix and test. 
And finally, you must invest time in making sure the tools are well documented, staff is cross trained in the maintenance of the tool, and you are protected from having the know-how to maintain the tool walk out the door in the event of staff turnover. And the final, perhaps less quantifiable area of cost is that of opportunity costs. If your IT staff is spending time writing and maintaining an ETL solution… what other priorities are not being addressed? What initiatives are not being addressed that could provide additional opportunities for your business?
6
Use Case: IBM i Data Warehouse
[Diagram: multiple data sources (MS SQL, Oracle on Linux, DB2 on Linux, DB2 on IBM System i) feeding a single data warehouse database on IBM System i DB2 via real-time CDC replication with transformation, with business intelligence reporting on top]

In this example of a more sophisticated data sharing use case, Pinnacle Entertainment decided to create a centralized data warehouse to leverage customer activity and information from their distributed sites. This allowed them to create and evaluate programs based upon customer behavior to determine effective national marketing campaigns. Again, as with many customers, their tools of choice for data analytics and reporting happened to reside on Windows, where they centralized, while their satellite sites all run DB2 on System i. Note that the source databases were not required to all be the same variety; it just happens that this was the environment at Pinnacle. The actual source systems could be a mix of multiple databases and operating systems.
7
Supports a Broad Range of Platforms
Leading Operating Systems Leading Databases IBM i IBM AIX HP-UX Solaris IBM Linux on Power Linux SUSE Enterprise Linux Red Hat Enterprise Microsoft Windows, including Microsoft Azure IBM DB2 for i IBM DB2 for LUW IBM Informix Oracle Oracle RAC MySQL* Microsoft SQL Server Teradata* Sybase Postgres* (in Q1-2018) I mentioned that MIMIX Share supports a variety of leading operating systems and databases. This slide provides the list of supported platforms. Not every database is supported on every operating system (for example Microsoft SQL Server does not run on IBM i), but MIMIX Share supports those combinations of OS and database that are supported by their respective platforms. Do note that Teradata and Oracle RAC databases as well as Linux on Power servers are currently supported only as target operating systems. * Target only
8
Real-Time Replication High-Level Architecture
How does MIMIX Share implement data sharing? This is an overview of the process. MIMIX Share software is installed on both the source and target database servers. MIMIX Share uses a point-to-point architecture, which means that data is moved directly from the source to the target database across the network without any intervening hops required for processing. This maximizes performance, simplicity, and flexibility, and eliminates the need for any intermediate server. Communication between databases uses JDBC drivers. Configuration and administration is done from an easy-to-use central administrative GUI console (also called the Director). The admin console runs on Wintel and is used to define the data sharing rules; these definitions are referred to as models. When Rep1 distributes these models from the admin console to the source and target, it also converts the model definitions to XML format. No user XML expertise is necessary: all edits, updates, and maintenance of the models are done via the admin GUI. The model definitions are stored in what is referred to as the EDMM (Enterprise Data Movement Model). This configuration information is stored on both the source and the target servers in a MIMIX Share internal metabase. The EDMM contains components that describe the data on each database server (such as the source/target hostnames, database names, data schemas, mappings, and data transformation rules). It also stores other information on how to replicate the data (such as scheduled versus real-time replication). The change data capture component of MIMIX Share recognizes changes as they occur on the source at the column/row level, and uses the model rules to transform and copy the data to the target as specified. Users can also administer and monitor MIMIX Share via a remote console interface from their desktop. A command line interface is also available.
Note: JDBC is an API for the Java programming language that defines how a client may access a database. It provides methods for querying and updating data in a database.
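The idea of a model definition serialized to XML for the EDMM can be sketched as follows. MIMIX Share's real XML schema is internal and not documented here, so every element and attribute name below is invented purely for illustration:

```python
# Hypothetical sketch of a data-movement model serialized to XML, in
# the spirit of the EDMM described above. The element/attribute names
# are assumptions for illustration, not MIMIX Share's actual schema.
import xml.etree.ElementTree as ET

model = ET.Element("model", name="cust_to_sql")
ET.SubElement(model, "source", host="ibmi01", table="CUSTPF")
ET.SubElement(model, "target", host="sql01", table="customer_master")
mapping = ET.SubElement(model, "mapping")
ET.SubElement(mapping, "column", source="CUNUM", target="customer_number")

# The administrator edits the model in the GUI; a tool like this would
# then emit the XML that gets distributed to source and target.
xml_text = ET.tostring(model, encoding="unicode")
print(xml_text)
```

The point is only that the model (hosts, tables, column mappings) is plain structured data, so it can be distributed to both servers and stored in a metabase without the user ever touching XML by hand.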
9
Change Data Capture (CDC) for Real-Time Replication
Change Data Capture (CDC) captures database changes immediately and quickly replicates them to one or more other databases in real time. Only changed data is replicated, to minimize bandwidth usage. Automatically extracts, transforms, and loads data into the target database without manual intervention or scripting.

[Diagram: Source Database → Change Data Capture (CDC) → Real-Time Replication with Transformation → Target, with conflict resolution, collision monitoring, tracking, and auditing]

MIMIX Share is built upon Vision's data replication technology, which has been proven for over 20 years by thousands and thousands of customers. At the source database, Share's reliable change data capture technology captures changes as they occur, and those changes are replicated in real time across any IP network to the target database. Any required transformations are performed during that transfer. At the target database, changes are applied and a number of measures are taken to ensure that the data is always sound. Built-in collision monitoring and conflict resolution ensure the integrity of the data. Tracking and auditing alert administrators to any instance where a transaction could not be applied, so that they can reapply as appropriate. Whether uni-directional or bi-directional, Share makes sure the data is always sound. Keep in mind that Share is only sending changes as they occur in real time, so data trickles across the network. This makes Share very LAN/WAN friendly, because it never sends large data transfers or uses a great amount of bandwidth at any one time. The benefit to your business is that your mission-critical information is captured, transformed, and replicated to the places where it is needed as soon as it is changed. Access to data that is current in real time gives you the ability to make better informed decisions.
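The capture-transform-apply flow above can be reduced to a toy loop. This is a generic illustration of the CDC pattern, not MIMIX Share's interface; the event shape and the in-flight transformation are assumptions:

```python
# Toy change-data-capture pipeline: only change events (not whole
# tables) cross the wire, and a transformation is applied in flight.
# Event structure and names are illustrative assumptions.

def replicate(change_events, target, transform):
    """Apply each captured change to the target, transformed en route."""
    for event in change_events:          # one INSERT/UPDATE per event
        row = transform(event["row"])    # transformation during transfer
        target[event["key"]] = row       # apply at the target database
    return target

# Two row-level changes captured at the source; the target receives
# only these deltas, with the quantity column retyped on the way.
events = [{"key": 101, "row": {"qty": "5"}},
          {"key": 102, "row": {"qty": "7"}}]
target_db = {}
replicate(events, target_db, lambda r: {"qty": int(r["qty"])})
print(target_db)
```

Because only the two changed rows move, bandwidth use is proportional to change volume rather than table size, which is the LAN/WAN-friendliness point made above.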
10
IBM i Log-Based Data Capture
[Diagram: Source DBMS → Journal → Change Selector (2) → Queue (1) → Retrieve/Transform/Send (3) → Apply Changed Data (4) → Target DBMS]

1. Use of the journal eliminates the need for invasive actions on the DBMS.
2. Selective extracts from the logs and a defined queue space ensure data integrity.
3. Transformation in many cases can be done off-box to reduce impact on production.
4. The apply process returns an acknowledgment to the queue to complete a pseudo two-phase commit.

With log-based data capture, the Change Selector recognizes changes as they occur, pulls the necessary information for replication and sharing from the database log, and places it in the queue. Only committed transactions are replicated/shared to the target, retaining write-order consistency, with no impact on production processing and low latency in applying changes to the queue.
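The journal-to-queue-to-apply flow can be sketched as two small functions. This is an illustrative model of the described behavior (committed-only selection, write-order preservation, acknowledgment releasing the queue entry), with invented names; it is not the product's code:

```python
from collections import deque

# Sketch of log-based capture: a change selector pulls only committed
# entries from the journal into a queue; the apply step acknowledges
# each entry, which releases it from the queue (the pseudo two-phase
# commit described above). All names are illustrative.

def select_changes(journal, queue):
    for entry in journal:              # journal entries are in write order
        if entry["committed"]:
            queue.append(entry)        # uncommitted work never ships

def apply_changes(queue, target):
    while queue:
        entry = queue[0]
        target.append(entry["change"]) # apply preserves write order
        queue.popleft()                # acknowledgment completes the
                                       # pseudo two-phase commit

journal = [{"change": "upd A", "committed": True},
           {"change": "upd B", "committed": False},
           {"change": "upd C", "committed": True}]
q, applied = deque(), []
select_changes(journal, q)
apply_changes(q, applied)
print(applied)
```

Reading the journal rather than instrumenting the DBMS is what makes the capture non-invasive: the production workload writes to the journal anyway, and the selector only ever reads it.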
11
Transform the Data Exactly HOW You Need To
Transforms data into useful information. 80+ built-in transformation methods. Field transformations, such as: DECIMAL(5,2); nulltostring(ZIP_CODE,'00000'). Table transformations, such as: column merging; column splitting; creating derived columns; custom lookup tables. Create custom data transformations using a powerful Java scripting interface.

[Example: an Italian-language source table in Rome (Misura, Prodotto, Codice, Quantita, Colore; sample row SC953 / Scarpe / 44 / Marrone) mapped to an English-language target in Boston (Size, Product Code, Quantity, Color; sample row SHO34 / Shoe / 9.5 / Brown)]

Share exactly HOW you need to: as we discussed earlier, data can be changed "on the fly" when moving from source to target, as specified by your data transformation model. There are 80+ built-in data transformation functions in MIMIX Share, and a few samples are listed on this slide under the "Fields" heading. Table functions also include merging, splitting, or creating new derived columns for the target. Custom lookup tables can also be deployed. For example, a zip code could be transformed by MIMIX Share to a State value by defining a transformation function that uses a lookup table containing all zip codes by State. For even more specialized processing, there is a powerful Java scripting language that allows for any custom transformations needed.
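A few of the transformation styles listed above can be imitated in a handful of lines. The real built-ins (such as nulltostring) belong to MIMIX Share, so these Python stand-ins are purely illustrative, and the zip-to-state lookup table holds made-up sample entries:

```python
# Hand-rolled stand-ins for some of the transformation styles above.
# Illustrative only: the real functions are MIMIX Share built-ins.

def null_to_string(value, default):
    """Replace a NULL (None here) with a fixed string, in the spirit
    of nulltostring(ZIP_CODE,'00000')."""
    return default if value is None else value

def merge_columns(first, last):
    """Column merging: derive one target column from two source columns."""
    return f"{last}, {first}"

# Custom lookup table: zip code -> state (sample entries, not real data).
STATE_BY_ZIP = {"60601": "IL", "94105": "CA"}

def zip_to_state(zip_code):
    """Derived column via lookup table, as in the slide's example."""
    return STATE_BY_ZIP.get(zip_code, "??")

print(null_to_string(None, "00000"))
print(merge_columns("Ada", "Lovelace"))
print(zip_to_state("60601"))
```

Each function maps one or more source fields to one target field, which is the shape all field-level transformations share regardless of how elaborate the rule is.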
12
Guarantees Information Accuracy
Ensures ongoing integrity: changes are collected in a queue on the source and moved to the target only after they are committed on the source; write-order consistency is retained; queue entries are retained until successfully applied; no database table locking. Ensures failure integrity: automatically detects communications errors, automatically recovers the connection and processes, and alerts the administrator; no data is lost.

MIMIX Share ensures ONGOING integrity: write-order consistency is retained for transaction sequences, as well as for sequences within a transaction. It ensures FAILURE integrity in the event of a connection failure by doing automated recovery and resumption of transmission from where the last transmittal left off, ensuring no data loss. Also, MIMIX Share provides automated MONITORING, so that ongoing health and any service interruptions are captured and notifications sent. A GUI monitor exists on the admin console, and there is also a web browser interface. Alerts can be captured and sent to staff via SMTP alerts.
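The failure-integrity behavior above amounts to: a change stays queued until it is confirmed applied, so a dropped link just means replication retries and resumes where it left off. This sketch simulates that with a deliberately flaky link; the function names and the simulation are assumptions, not the product's interface:

```python
# Sketch of failure integrity: queue entries are retained until
# successfully applied, so a connection failure loses nothing.
# The flaky link is simulated; names are illustrative.

def send_with_recovery(queue, target, link):
    while queue:
        change = queue[0]
        try:
            link(change)          # transmission; may raise ConnectionError
        except ConnectionError:
            continue              # auto-recover: retry the same change
        target.append(change)
        queue.pop(0)              # dequeue only after a successful apply

failures = iter([True, False, False])   # first attempt fails, then recovers

def flaky_link(change):
    if next(failures):
        raise ConnectionError("link dropped")

applied = []
send_with_recovery(["chg-1", "chg-2"], applied, flaky_link)
print(applied)
```

Because the dequeue happens only after a successful apply, the retry replays the exact change that was in flight when the link dropped, preserving both completeness and order.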
13
Accurate Tracking & Data Auditing
Detects and resolves conflicts to maintain data integrity. Model verification validates the data movement model. Model versioning. Audit journal mapping tracks all updates and changes, recording: before and after values for every column; type of transaction; type of sending DBMS; table name; user name; transaction information. Records to a flat file or to a database table. Can assist with SOX, HIPAA, and GDPR audit requirements.

MIMIX Share automatically resolves any data collisions, so your data stays in sync and accurate at all times. Also, when data movement models are created or updated, MIMIX Share does validation and logic checking before allowing activation, and will issue warnings. Once models are activated, the status of the active and previous models is tracked by versioning. You can optionally turn on journal mapping to enable recording of the details of any update activity on a particular table. Journal mapping captures the "before" and "after" values from an update statement to record what changed. Additional values recorded include the type of transaction, the table_name, user_name, and other values. Auditing can write to a flat file or to a database table. Most shops are interested in this feature to meet SOX, HIPAA, or other audit requirements.
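The shape of a journal-mapping audit record can be sketched as below. The record layout is invented for illustration (a real audit row would follow the product's own format), but it carries the fields the slide lists: before/after values, transaction type, table name, and user name:

```python
import json

# Hypothetical audit record in the spirit of journal mapping above.
# Field names and layout are assumptions for illustration only.

def audit_update(table, user, before, after):
    return {
        "table_name": table,
        "user_name": user,
        "transaction": "UPDATE",
        "changes": {
            col: {"before": before.get(col), "after": after[col]}
            for col in after
            if before.get(col) != after[col]   # record only changed columns
        },
    }

rec = audit_update("CUSTPF", "auditor1",
                   before={"CUCLM": 1000.00}, after={"CUCLM": 2500.00})
# Could equally be appended to a flat file or inserted into a table.
print(json.dumps(rec))
```

Keeping both the before and after value per column is what lets an auditor reconstruct exactly what an update did, which is the property SOX/HIPAA/GDPR reviews look for.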
14
Lets You Share Exactly WHAT You Need
Filters determine what data gets moved. Select specific columns and tables, e.g. create a new column on the target. Select specific rows and tables, e.g. a gate condition that splits data to different target databases.

Share exactly WHAT you need: only the data you want is moved from source to target; you do not need to share entire tables. You can specify which column or columns are to be shared to the target, so you can be very selective about what you want to replicate. You can also concatenate columns from the source to the target, or create a new target column as we saw in the previous example. You can also specify which rows you want to share to a target. It is possible to share only rows that meet certain conditions, such as rows for which the date is after January 2008; or perhaps you wish to share rows by geographic area, so you could share rows from Illinois to one target database and rows from California to a different target database.
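The row-and-column filtering described above (a gate condition routing rows by state, with only selected columns shared) can be sketched generically. The column names and the routing rule are hypothetical:

```python
# Sketch of row/column filtering: a gate condition selects rows, and
# only the chosen columns are shared. Names are illustrative.

def share(rows, columns, gate):
    """Yield only the gated rows, trimmed to the shared columns."""
    for row in rows:
        if gate(row):
            yield {c: row[c] for c in columns}

rows = [{"id": 1, "state": "IL", "ssn": "x"},   # ssn is never shared
        {"id": 2, "state": "CA", "ssn": "y"}]

# Route Illinois rows to one target and California rows to another,
# as in the geographic example above.
il_target = list(share(rows, ["id", "state"], lambda r: r["state"] == "IL"))
ca_target = list(share(rows, ["id", "state"], lambda r: r["state"] == "CA"))
print(il_target, ca_target)
```

Note that the excluded column never leaves the source at all, which is also how filtering can serve data-minimization goals, not just bandwidth.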
15
Mapping Columns Example
Source server: DB2 database server, table CUSTPF. Target server: MS SQL Server database server, SQL table customer_master (table mapping), with column mappings:

CUNUM, Numeric (10) → Customer number, Numeric (10)
CUNAM, Alpha-numeric (20) → Customer name, Alpha-numeric (10)
CUAD1, Alpha-numeric (25) → Customer address line 1, Alpha-numeric (25)
CUAD2, Alpha-numeric (25) → Customer address line 2, Alpha-numeric (25)
CUAD3, Alpha-numeric (25) → Customer address line 3, Alpha-numeric (25)
CUAD4, Alpha-numeric (25) → Customer address line 4, Alpha-numeric (25)
(no source column shown) → Customer address line 5, Alpha-numeric (25)
CUTEL, Numeric (10) → Customer telephone, Numeric (10)
CUCLM, Numeric (10,2) → Customer credit limit, Numeric (10,2)

Mapping columns: in the simplest case the source and target database tables could have the same name and the same column headings. If that is the case, then Share will automatically associate the mappings for the source table name and columns with the same-named table and columns on the target. You could then further edit this model as desired. In the mapping example displayed here, an administrator is sharing columns from DB2 into columns on a target MS SQL Server database, so the table names are not the same, nor are the column names. MIMIX Share can easily address this scenario as well. As part of the mapping, the column names from the DB2 table are easily mapped to the MS SQL Server column names to be used on the target, even though the names differ. The field lengths and/or types can also differ, and MIMIX Share can handle that.
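The column-mapping step on this slide can be sketched as a rename-and-retype table. The mapping dictionary mirrors a few of the slide's example columns; the code itself is an illustrative assumption, not the product's configuration format:

```python
# Sketch of column mapping: source (DB2) column names are renamed to
# the target's (SQL Server) names and values coerced to the target
# type. The mapping echoes the slide's example; code is illustrative.

COLUMN_MAP = {              # source name -> (target name, target type)
    "CUNUM": ("customer_number", int),
    "CUNAM": ("customer_name", str),
    "CUTEL": ("customer_telephone", int),
}

def map_row(source_row):
    """Produce a target row from a source row using the mapping."""
    return {target: cast(source_row[src])
            for src, (target, cast) in COLUMN_MAP.items()
            if src in source_row}

row = map_row({"CUNUM": "12345", "CUNAM": "Acme", "CUTEL": "5551234567"})
print(row)
```

Differing names and differing types are handled in the same pass, which is why a DB2 packed-decimal column can land in a SQL Server column of another name and width.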
16
Additional Replication Options
Topologies: One Way, Two Way, Distribute, Bi-Directional, Consolidate, Cascade. Choose a topology or combine them to meet your data sharing needs.

As noted earlier, MIMIX Share is extremely flexible and powerful in how it does data sharing at the table, column, and row level. This slide illustrates the flexibility of MIMIX Share at the higher, architectural level. A wide variety of topologies are supported, as are combinations of these topologies. The simplest case is one-way source-to-target sharing, but this can be expanded to include multiple targets for distributed sharing, or the opposite, where multiple databases feed into a centralized target for a consolidated solution. Two-way and bi-directional replication are also supported, as well as multi-hop, cascading configurations. And these replication architectures can be combined to meet your unique needs.
17
Other Use Cases
18
Use Case: Offload Reporting from Production Database
Retail company. Many cost-effective tools are available on the MS SQL Server platform for query reports. [Diagram: Production System (IBM System i, DB2, Lawson M3 (Movex)) → Offload Query System (MS SQL Server) via real-time CDC replication with transformation, feeding query reports and a data warehouse load]

In this very simple example of offloading data from the production server for reporting without impacting production performance, Red Wing Shoes, a major shoe manufacturer, wanted to offload queries and reporting from their centralized production database. Ad hoc queries and reporting were chewing up too many cycles on production. Like many customers, the reporting and analytical tools they preferred were all Windows-based, so they replicated data in real time from production DB2 to an MS SQL database using a simple one-way, source-to-target configuration. While many customers deploy snapshots to perform a similar query-offload function, with MIMIX Share the target database remains 100% accurate with production at all times, which is better than snapshots, which are only point-in-time copies.

Data is already partially "scrubbed" and available for loading data warehouses and data marts without performance impact on the production system. Reduces CPU and I/O overhead on the production system and improves user response times.
19
Use Case: Database Migration
Manufacturing company. [Diagram: Old System (IBM i, DB2 for i, JDE standard) → New System (IBM i, DB2 for i, JDE Unicode) via real-time CDC replication with transformation]
20
Use Case: Database Replatforming
Insurance company. Transformation between different OS and database platforms. Users are moved to the new server in phases over a period of time. [Diagram: Old System (IBM i, DB2) → New System (Sun, Oracle RAC)]

This use case highlights the use of MIMIX Share for a migration project by a very large insurance company that moved from a DB2-based system on IBM i to a new system running Oracle on Sun. Because the migration project spanned a number of months and they wished to cut their servers over slowly, they implemented MIMIX Share in an active-active architecture: both databases replicated to the other to keep the old and new systems current and in sync. When all servers were finally migrated, the final cutover was done. All cutovers required near-zero downtime. And since the migration was between different platforms using databases with completely different schemas, MIMIX Share transformations were used to fit the data to the target system.

Two-way active-passive replication to enable application server switching. Near-zero downtime for cutover to new systems.
21
Use Case: Application Integration
Online banking: bi-directional replication between Microsoft SQL Server and IBM System i DB2.

1. Customers enter new banking transactions online; they are captured in SQL Server.
2. MIMIX Share replicates the transactions in real time to DB2/400.
3. A back-office batch application processes incoming transactions and updates data.
4. MIMIX Share replicates the processed transactions back to SQL Server in real time.
5. Customers view the processed transactions online.

In this example, MIMIX Share is used for a banking project that employed bi-directional data sharing for an online banking application. End users execute transactions against their bank accounts via a web-based application that runs against an MS SQL Server database. The transactions are shared in real time from SQL Server to the centralized banking database on System i DB2. The data is immediately applied to the central database, and then the result of the transaction is propagated back down to the SQL Server database in real time, so that the customer can immediately view the result of the transaction on their bank account from their web session.
22
Additional Use Cases

[Diagram: a central ERP system shares Customer Orders, Payment Details, Product Catalogue, and Price List with eCommerce & web portals, a DR/backup server, a test & audit environment, and a flat-file data exchange with an outside vendor]

This scenario is theoretical, but it could definitely be implemented, and it is designed to show the flexibility of MIMIX Share. Here we see Share sending data from a central ERP system to an eCommerce site, to a test and audit server, to a disaster recovery backup server, and to a flat file for exchange with an outside vendor. All are possible use cases with Share.
23