Base Content Slide Larry Ellison CEO, Oracle "By having all of the pieces in the stack—from the silicon all the way up to the application—we'll be able to deliver systems that run faster, are fault-tolerant, are highly secure—much more secure, much more performance, much more cost-effective, much easier to use than we ever could have delivered by simply delivering components." With our recent Sun acquisition, Oracle is uniquely positioned as the only vendor to provide a complete, integrated stack – from storage to scorecard. And as Larry Ellison states, by having this integrated stack, we are able to deliver systems that are faster, more secure, easier to use and more cost effective.
Extreme Performance Data Warehousing Çetin Özbütün Vice President, Data Warehousing Technologies
The Rise of the Intelligent Economy “From recession comes an opportunity to reset a number of industry structures…there is an opportunity to infuse industries with technologies that position them to operate more effectively in the next 50 years.” Lessons Learned in Building the Intelligent Economy, May 2010
All Businesses Want Better Insight
Industry: Typical Questions
Retail: What stores should be closed or sold? Which customers will respond to a new promotion?
Telecommunications: What are the issues affecting churn by region? What is the average revenue per user (ARPU)?
Healthcare: What are the most common patient service requests? What is the average level of clinical supplies on hand?
Financial Services: How will new online services impact deposits? How does the average loan compare to last year?
Utilities: Who do we target for the energy efficiency program? What resources are needed to restore an outage?
Public Sector: What is the trend on budget and expenditures? What is the most cost-effective way to manage waste?
Challenge: Much More Data to Analyze Data Warehouse Size and Growth Source: TDWI Next Generation Data Warehouse Platforms Report, 2009
Challenge: No Single Source of Truth Expensive Data Warehouse Architecture Data Marts OLAP ETL Data Mining Data Marts ETL OLAP Data Mining
Challenge: User Requirements Not Met High Churn in Data Warehouse Platforms Source: TDWI Next Generation Data Warehouse Platforms Report, 2009
DW Strategy Single source of truth Extreme performance Lower cost of ownership Deeper Insight
A Single Source of Truth? Optional 2-minute video ‘A Single Source of Truth’ that explains the benefit of data and server consolidation with the Sun Oracle Database Machine. The WMV-format video can be downloaded from here: http://database.us.oracle.com/pls/htmldb/Z?p_url=http://files.oraclecorp.com/content/AllPublic/Users/Users-J/john.brust-Public/Database%2520Vignettes/OracleExadataConsolidateAnalytics-8409751.wmv&p_cat=92491&p_id=46&p_company=501318803116695 Save it into the same library as the PPT, and the video should play when clicked in slideshow mode.
Consolidate Onto a Single Platform Faster Performance, Single Source of Truth Data Marts Data Mining Online Analytics ETL Oracle Database 11g Oracle Exadata Database Machine
Oracle Exadata Database Machine For OLTP, Data Warehousing & Consolidated Workloads Improve query performance by 10x Better insight into customer requirements Expand revenue opportunities Consolidate OLTP and analytic workloads Lower admin and maintenance costs Reduce points of failure Integrate analytics and data mining Complex and predictive analytics Lower risk Streamline deployment One support contact
Oracle Exadata Database Machine Family: Oracle Exadata Database Machine X2-2
Oracle Database Server Grid: 8 2-processor Database Servers, 96 CPU Cores, 768 GB Memory
Exadata Storage Server Grid: 14 Storage Servers, 5 TB Smart Flash Cache, 336 TB Disk Storage
Unified Server/Storage Network: 40 Gb/sec InfiniBand Links
Available in full, half, and quarter racks
Oracle Exadata Database Machine Family: Oracle Exadata Database Machine X2-8
Oracle Database Server Grid: 2 8-processor Database Servers, 128 CPU Cores, 2 TB Memory, Oracle Linux or Solaris 11 Express
Exadata Storage Server Grid: 14 Storage Servers, 5 TB Smart Flash Cache, 336 TB Disk Storage
Unified Server/Storage Network: 40 Gb/sec InfiniBand Links
Traditional Query Problem
What Were Yesterday’s Sales? Select sum(sales) where salesdate = '22-Jan-2010'…
Steps: return entire Sales table, discard most of Sales table, sum
Data is pushed to database server for processing
I/O rates are limited by speed and number of disk drives
Network bandwidth is strained, limiting performance and concurrency
Exadata Smart Scan: Improve Query Performance by 10x or More
What Were Yesterday’s Sales? Select sum(sales) where salesdate = '22-Jan-2010'…
Steps: return Sales for Jan 22 2010, sum
Off-load data-intensive processing to Exadata Storage Server
Exadata Storage Server only returns relevant rows and columns
Wide InfiniBand connections eliminate network bottlenecks
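The contrast between the traditional scan and an offloaded scan can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `SALES` rows, the `store` column, and both function names are hypothetical, and nothing here reflects how Exadata Storage Server is actually implemented. It only shows why filtering and projecting at the storage tier reduces what crosses the network.

```python
from datetime import date

# Hypothetical stand-in for a SALES table held on a storage server.
SALES = [
    {"salesdate": date(2010, 1, 21), "sales": 120.0, "store": "A"},
    {"salesdate": date(2010, 1, 22), "sales": 80.0, "store": "A"},
    {"salesdate": date(2010, 1, 22), "sales": 45.5, "store": "B"},
    {"salesdate": date(2010, 1, 23), "sales": 60.0, "store": "B"},
]


def traditional_scan(table, day):
    """Ship every row to the database tier, then filter and sum there."""
    shipped = list(table)  # the whole table crosses the network
    total = sum(r["sales"] for r in shipped if r["salesdate"] == day)
    return total, len(shipped)


def offloaded_scan(table, day):
    """Filter rows and project one column at the storage tier, then sum."""
    shipped = [r["sales"] for r in table if r["salesdate"] == day]
    return sum(shipped), len(shipped)
```

Both functions return the same total for a given day; the second element of each result is the number of items shipped, which is the quantity the slide is concerned with.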
Exadata Storage Index: Transparent I/O Elimination with No Overhead
Table columns: B, C, D (column B values shown: 1, 3, 5, 8)
Index entry for first set of rows: Min B = 1, Max B = 5
Index entry for second set of rows: Min B = 3, Max B = 8
Select * from Table where B<2: only the first set of rows can match
Maintain summary information about table data in memory
Eliminate disk I/Os if MIN / MAX never match “where” clause
Completely automatic and transparent
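The min/max idea on this slide can be illustrated with a small Python sketch. The row values in `REGIONS` are hypothetical, chosen only so that the two index entries come out as Min B = 1 / Max B = 5 and Min B = 3 / Max B = 8; the function names are also assumptions for illustration and not part of any Oracle interface.

```python
def build_storage_index(regions, column):
    """Record (min, max) of `column` for each region of rows."""
    return [
        (min(r[column] for r in rows), max(r[column] for r in rows))
        for rows in regions
    ]


def regions_to_read_less_than(index, bound):
    """Region numbers that could contain rows with column < bound."""
    return [i for i, (lo, _hi) in enumerate(index) if lo < bound]


# Hypothetical data: two sets of rows, as in the slide's diagram.
REGIONS = [
    [{"B": 1}, {"B": 3}, {"B": 5}],
    [{"B": 5}, {"B": 8}, {"B": 3}],
]
INDEX = build_storage_index(REGIONS, "B")
```

For the predicate `B < 2`, a region can be skipped whenever its recorded minimum is already 2 or more, so `regions_to_read_less_than(INDEX, 2)` keeps only the first region.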
Exadata Hybrid Columnar Compression Reduce Disk Space Requirements Uncompressed Data Data Warehouse Appliances OLTP Data DW Data Archive Data Oracle
Built-in Analytics Secure, Scalable Platform for Advanced Analytics Oracle OLAP Analyze and summarize Oracle Data Mining Uncover and predict Complex and predictive analytics embedded into Oracle Database 11g Reduce cost of additional hardware, management resources Improve performance by eliminating data movement and duplication
Exadata Smart Flash Cache: Extreme Performance for OLTP Applications
Diagram labels: Frequently Used Data, Infrequently Used Data
Automatically caches frequently accessed ‘hot’ data in flash storage
Assigns the rest to less expensive disk drives
Knows when to avoid trying to cache data that will never be reused
Processes data at 50 GB/sec and up to 1 million I/Os per second
Benefits Multiply: Converting Terabytes to Gigabytes
10 TB of User Data (No Indexes)
1 TB of User Data (With 10x Compression)
100 GB of User Data (With Partition Pruning)
20 GB of User Data (With Storage Indexes)
5 GB of User Data (With Smart Scan)
Sub second “10 TB” Scan
SmartScan is very powerful, but it is just one of the ways the Database Machine ensures high I/O bandwidth. For DW/BI applications, the many unique software capabilities of the Database Machine combine to eliminate most I/O. Compression, pruning, and storage indexes add further I/O elimination on top of SmartScan; walk through this example. Emphasize the “effective” outcome of scanning 10 TB of data in less than 1 second, through the combination of all of the techniques.
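The arithmetic behind this walkthrough can be checked with a short Python sketch. It assumes the stage order given in the notes (compression, partition pruning, storage indexes, Smart Scan) and decimal units (1 TB = 1,000 GB); the names `STAGES`, `reduction_factors`, and `cumulative_reduction` are hypothetical.

```python
# Data volume that must be read at each stage, in GB (assumes 1 TB = 1,000 GB).
STAGES = [
    ("no indexes", 10_000),
    ("10x compression", 1_000),
    ("partition pruning", 100),
    ("storage indexes", 20),
    ("Smart Scan", 5),
]


def reduction_factors(stages):
    """Reduction factor contributed by each stage relative to the previous one."""
    return [
        (name, prev / cur)
        for (_, prev), (name, cur) in zip(stages, stages[1:])
    ]


def cumulative_reduction(stages):
    """Overall ratio between the first and the last stage."""
    return stages[0][1] / stages[-1][1]
```

Under these assumptions the individual factors are 10, 10, 5, and 4, which multiply to an overall 2,000-fold reduction from 10 TB to 5 GB.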
ETL with Oracle BCP Unload Staging Raw Files Parallel Loads FTP Non-Oracle Source Data Pump Unload SCP Oracle Source Fast data loading using DBFS and External Tables Fast transforms in Oracle Database 11g via Parallel DML operations Best-in-class performance for large batch-oriented data loads
Turkcell Runs 10x Faster on Exadata Compresses Data Warehouse by 10x Replaced high-end SMP Server and 10 Storage Cabinets Reduced Data Warehouse from 250TB to 27TB Using OLTP & Hybrid Columnar Compression Ready for future growth where data doubles every year Experiencing 10x faster query performance Delivering over 50,000 reports per month Average report runs reduced from 27 to 2.5 mins Up to 400x performance gain on some reports
Softbank Runs 2x–8x Faster on Exadata 36 Teradata Racks Replaced by 3 Exadata Racks
Exadata has more disk drives per rack, larger disk drives (2 TB), and much better compression. This means that Exadata can hold much more user data than other systems and costs much less per terabyte of user data. Exadata user data per SATA rack with 10x compression is 500 TB. Teradata has fallen very far behind in compression technology, which makes it much more costly for large data environments. The Teradata 2580 holds 45 TB per cabinet using the maximum-sized 1 TB drive and 1.3x compression (taken from Teradata specifications). A single Exadata rack matches the user data capacity of the largest Teradata 2580 (12s cabinet holding 517 TB user data). The flagship Teradata 5600 holds even less data per rack than the 2580. There is approximately a 20:1 ratio of user data per rack comparing Exadata to the Teradata 5600. Netezza TwinFin: 32 TB uncompressed per rack, 128 TB compressed (assuming their maximum 4x compression).
Workload Management for DW: Setting Up a Workload Management System
Define Workloads, Filter Exceptions, Manage Resources, Monitor Workloads, Adjust Plans
Define Workload Plans, Execute Workloads, Monitor Workloads, Adjust Workload Plans
Tools: IORM, RAC, OEM, DBRM
The RAC piece includes things like: Services; Server Pools (Grid Infrastructure) to provide elasticity (add servers to pool to increase memory); Instance Caging (consolidation)
Workload Management
Diagram labels: Request, Assign, Queue, Execute, Reject, Downgrade, Ad-hoc Workload
Execute. Each request executes on a RAC Service, which limits the physical resources and allows scalability across racks
Assign. Each request is assigned to a consumer group by: OS or DB username; application or module; action within module; administrative function
Ad-hoc Workload. Each consumer group has: Resource Allocation (example: 10% of CPU/IO resources); Directives (example: 20 active sessions); Thresholds (example: no jobs longer than 2 min)
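The flow on this slide (assign, then execute, queue, downgrade, or reject) can be modeled with a small Python sketch. It is a toy model only: `ConsumerGroup`, `submit`, and the field names are hypothetical and do not correspond to the Database Resource Manager API. It assumes a directive expressed as an active-session limit and a threshold expressed as a maximum estimated runtime.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ConsumerGroup:
    name: str
    cpu_share: int        # resource allocation, percent of CPU/IO
    max_active: int       # directive: active session limit
    max_minutes: float    # threshold: longest job allowed
    active: int = 0
    queue: deque = field(default_factory=deque)


def submit(group, request_id, est_minutes, fallback=None):
    """Decide what happens to a request: execute, queue, downgrade, or reject."""
    if est_minutes > group.max_minutes:
        # Threshold exceeded: move to a lower-priority group if one is given.
        if fallback is not None:
            return "downgrade:" + submit(fallback, request_id, est_minutes)
        return "reject"
    if group.active >= group.max_active:
        # Directive reached: hold the request until a session frees up.
        group.queue.append(request_id)
        return "queue"
    group.active += 1
    return "execute"
```

For example, with an ad-hoc group limited to one active session and 2-minute jobs, a second short request is queued, while a 5-minute request is either downgraded to a fallback group or rejected.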
Workload Management Request Real-Time ETL Batch ETL Analytic Reports Assign Execute Execute OLTP Requests Ad-hoc Workload Queue Downgrade Reject
Workload Management Request Real-Time ETL Queue R-T 10% Batch ETL Analytic Reports Analytic Reports 50% Queue Assign OLTP Requests OLTP 5% Reject Downgrade Queue Ad-hoc 25% Ad-hoc Workload Queue
Oracle Exadata for Data Warehousing Optional 3-minute video from customer BioWare that explains the benefit of Oracle Exadata Database Machine for data warehousing in the online games industry. The WMV-format video can be downloaded from here: http://database.us.oracle.com/pls/htmldb/Z?p_url=http://stcontent.oracle.com/content/dav/oracle/Libraries/ST%20Product%20Management/ST%20Product%20Management-Public/11gR2/BioWare_Exadata.wmv&p_cat=98001&p_id=46&p_company=501318803116695 Save it into the same library as the PPT, and the video should play when clicked in slideshow mode.
Yaz Iida Chief Executive Officer LinkShare “Our continued investments in resources and technology, including the new Oracle Exadata database, is providing advertisers and publishers with the increased performance, usability, and innovation that will help drive strong revenue growth today and in the future.” LinkShare press release, 9/28/2010 http://econsultancy.com/us/press-releases/5197-linkshare-debuts-new-advertiser-dashboard
Vinod Haval Vice President and Manager, Database Products Bank of America “We need one solution, one architecture. From that perspective, Exadata provides the right platform for consolidating database operations.” Source: InfoWeek article 9/25/10 http://www.informationweek.com/news/business_intelligence/analytics/showArticle.jhtml?articleID=227500637
Oracle Exadata Momentum Rapid adoption in all geographies and industries
Oracle Database 11g The Best Database for Data Warehousing Real Application Clusters Advanced Compression Partitioning OLAP Data Mining World record performance for fast access to information Manage growing volumes of information cost-effectively Reduce costs through server and data consolidation
The Concept of Partitioning Maintain Consistent Performance as Database Grows SALES SALES SALES Europe USA Jan Feb Jan Feb Large Table Difficult to Manage Partition Divide and Conquer Easier to Manage Improve Performance Composite Partition Higher Performance Match to business needs
Partition for Performance: Partition Pruning
What was the total sales amount for May 20 and May 21 2010?
Select sum(sales_amount) From SALES Where sales_date >= to_date('05/20/2010','MM/DD/YYYY') And sales_date < to_date('05/22/2010','MM/DD/YYYY');
Sales Table partitions: 5/19, 5/20, 5/21, 5/22 (only 5/20 and 5/21 are accessed)
Performs operations only on relevant partitions
Dramatically reduces amount of data retrieved from disk
Improves query performance and optimizes resource utilization
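Partition pruning for this question can be sketched in Python as follows. The daily partitions and their amounts in `PARTITIONS` are hypothetical, as are the function names; the sketch uses a half-open date range so that the May 22 partition is never touched when asking about May 20 and May 21.

```python
from datetime import date

# Hypothetical daily partitions: partition key -> sales amounts stored in it.
PARTITIONS = {
    date(2010, 5, 19): [10.0, 20.0],
    date(2010, 5, 20): [5.0, 15.0],
    date(2010, 5, 21): [30.0],
    date(2010, 5, 22): [40.0],
}


def prune(partitions, start, end_exclusive):
    """Keep only partitions whose key falls in [start, end_exclusive)."""
    return [d for d in sorted(partitions) if start <= d < end_exclusive]


def total_sales(partitions, start, end_exclusive):
    """Sum sales over the surviving partitions and report which were read."""
    touched = prune(partitions, start, end_exclusive)
    return sum(sum(partitions[d]) for d in touched), touched
```

Calling `total_sales(PARTITIONS, date(2010, 5, 20), date(2010, 5, 22))` reads two of the four partitions; the 5/19 and 5/22 partitions are skipped without being scanned.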
Partition to Manage Data Growth Compress Data and Lower Storage Costs Archive Data Read Only Data Active Data 15-50x Archive Compression 10-15x DW Compression 3x OLTP Compression Distribute partitions across multiple compression tiers Free up storage space and execute queries faster No changes to existing applications
In-Memory Parallel Execution: Efficient use of memory on clustered servers
In-Memory Parallel Query in Database Tier
Compress more data into available memory on cluster
Intelligent algorithm places table fragments in memory on different nodes
Reduces disk I/O and speeds query execution
Automated Degree of Parallelism
Automatically determine DOP (diagram values: 64, 32, 16, 8)
Enough parallel servers available: execute immediately
Not enough parallel servers available: queue statements; when the required number of servers are available, execute the first statement
Optimizer derives the best Degree of Parallelism based on resource requirements of all concurrent operations
Less DBA management, better resource utilization
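The queueing behavior described on this slide can be sketched as a first-in, first-out queue in front of a fixed pool of parallel servers. This is a simplified model under assumptions: the class `ParallelServerPool` and its methods are hypothetical, the degree of parallelism is passed in rather than derived by an optimizer, and strict FIFO ordering is assumed.

```python
from collections import deque


class ParallelServerPool:
    """Toy model of statement queueing in front of a parallel server pool."""

    def __init__(self, total):
        self.free = total
        self.waiting = deque()  # FIFO of (statement, dop)

    def submit(self, statement, dop):
        # Run only if nothing is queued ahead and enough servers are free.
        if not self.waiting and dop <= self.free:
            self.free -= dop
            return "execute"
        self.waiting.append((statement, dop))
        return "queue"

    def release(self, dop):
        """Return servers to the pool and start queued statements in order."""
        self.free += dop
        started = []
        while self.waiting and self.waiting[0][1] <= self.free:
            statement, need = self.waiting.popleft()
            self.free -= need
            started.append(statement)
        return started
```

With a pool of 64 servers, statements needing 32 and 16 run at once, a statement needing 64 waits, and a later statement needing 8 queues behind it; each queued statement starts only once enough servers have been released and it is first in line.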
Summary Management: Improve Response Time with Materialized Views
Diagram: SQL Query, Query Rewrite, Materialized Views (Sales by Date, Sales by Product, Sales by Region, Sales by Channel), Relational Star Schema (Region, Date, Products, Channel)
Pre-summarized information stored within Oracle Database 11g
Separate database object, transparent to queries
Supports sophisticated transparent query rewrite
Fast incremental refresh of changed data
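The idea of answering a query from a pre-summarized object instead of the detail rows can be sketched in Python. The `FACT` rows and all names here are hypothetical, and the lookup in `sales_for_region` is only a stand-in for query rewrite, which in the database is decided by the optimizer rather than by application code.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, amount).
FACT = [
    ("West", "A", 10.0),
    ("West", "B", 5.0),
    ("East", "A", 7.0),
]


def build_summary(rows, key_index):
    """Pre-aggregate amounts by one dimension, like a 'Sales by Region' summary."""
    summary = defaultdict(float)
    for row in rows:
        summary[row[key_index]] += row[2]
    return dict(summary)


SALES_BY_REGION = build_summary(FACT, 0)


def sales_for_region(region, summaries, fact):
    """Answer from a summary when one exists, otherwise scan the fact rows."""
    summary = summaries.get("region")
    if summary is not None:
        return summary.get(region, 0.0), "rewritten"
    return sum(amount for r, _, amount in fact if r == region), "base scan"
```

The caller asks the same question either way; only the second element of the result shows whether the summary or the detail rows were used, which mirrors the "transparent to queries" point on the slide.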
Cube Organized Materialized Views SQL Query Summaries Region Date Query Rewrite Automatic Refresh Products Channel Exposes Oracle OLAP cubes as relational materialized views Provides SQL access to data stored in OLAP cubes Any BI tool or SQL application can leverage OLAP cubes
DW Strategy Single source of truth Extreme performance Lower cost of ownership Deeper Insight
In-database Analytics Bring Algorithms to the Data, Not Data to the Algorithms Analytic computations done in the database Dimensional analysis Statistical analysis Data Mining Scalability Security Backup & Recovery Simplicity OLAP Statistics Data Mining
Oracle OLAP Built-in Access to Analytic Calculations How do sales in the Western region this quarter compare with sales a year ago? What will sales next quarter be? What factors can we alter to improve the sales forecast? Multidimensional analytic engine that analyzes summary data Offers improved query performance and fast, incremental updates Embedded in Oracle Database instance and storage
Oracle OLAP and OBIEE Calculations Computed Faster in OLAP Engine
Oracle Data Mining Find Hidden Patterns, Make Predictions Retail Financial Services Customer Segmentation Response Modeling Credit Scoring Possibility of default Communications Utilities Customer churn Network intrusion Product bundling Predict power line failure Healthcare Public Sector Patient outcome prediction Fraud detection Tax fraud Crime analysis Collection of data mining algorithms that solve business problems Simplifies development of predictive BI applications Embedded in Oracle Database instance and storage
Oracle Data Mining and OBIEE Prediction and Probability Results Integrated in Reports
Oracle Spatial and OBIEE Enrich BI with map visualization of Oracle Spatial data Enable location analysis in reporting, alerts and notifications Use maps to guide data navigation, filtering and drill-down Increase ROI from geospatial and non-spatial data
Oracle Exadata Intelligent Warehouse For Industries Data Models Business Intelligence Exadata Combine deep industry knowledge with data warehousing expertise Help jump-start design and implementation of data warehouses Available for Retail and Communications industries
Oracle Industry Data Models Reference Data Model Aggregate Data Model Relational (STAR) for BI OLAP for Analytical Derived Data Model Data Mining/Complex Reports/Query Base Data Model (3NF) Atomic Level of Transaction Data Combine deep industry knowledge with data warehousing expertise Help jump-start design and implementation of data warehouses Optimized for Oracle Database 11g and Oracle Exadata
Oracle Data Warehousing: What Customers Think… Optional 1-minute video montage of customers discussing the benefits of Oracle Database 11g (use instead of quote slides). If you don’t want to use it, see the hidden customer quote slides that follow. The WMV-format video can be downloaded from the Event Kit pages: http://my.oracle.com/portal/page/myo/ROOTCORNER/SALES_KIT_REPOSITORY/Products/Database%20and%20Information%20Management/database_data_warehouse/Data%20Warehousing%2009%20_7709486.mpg Save it into the same library as the PPT and simply ‘insert movie from file’ to embed it into this slide.
Henry Lovoy Data Manager HealthSouth Corporation “Oracle Database 11g, along with Oracle Real Application Clusters, Advanced Compression and Partitioning, all lend themselves to delivering highly available, high performance data warehousing.” Source: 4/12/10 press release http://www.oracle.com/us/corporate/press/068139
Extreme Performance Data Warehousing Integrated Technology Stack Smart Storage Database Data Models ELT Tools BI Tools BI Applications Single source of truth Easy to deploy and manage Extreme performance Meets all end user requirements Lower cost of ownership
Data Warehouse Reference Architecture
Data Warehouse Reference Architecture
Base data warehouse schema: atomic-level data, 3NF design; supports general end-user queries; data feeds to all dependent systems
Application-specific performance structures: summary data / materialized views; dimensional view of data; supports specific end-users, tools, and applications
Oracle #1 for Data Warehousing Source: IDC, July 2009 – “Worldwide Data Warehouse Management Tools 2008 Vendor Shares”