
Oracle Challenges
Parallelism Limitations
Parallelism is the ability of a single query to run across multiple processors or servers. Large queries need parallelism to split the job across many processors and increase performance.
–Oracle is not designed for extreme parallelism
    • Parallelism is a part-time feature, set object by object and query by query
    • It is not designed to let every query run in parallel
    • It is not designed to let any single query run across all processors and server nodes
    • It can support a few wide (high-parallelism) queries or many narrow (low/no-parallelism) queries, but not both at the same time
    • When it runs out of parallel processes, queries default to serial mode and run at a fraction of normal performance
–Our ETL has design and operational issues that result from these parallelism limits
    • ETL schedules carry artificial dependencies to ensure enough parallel processes are available
    • Big ETL workflows that start during heavy workload often get no parallelism, run very slowly, and have to be restarted later when parallel processes are available
–Report queries can perform inconsistently because of parallelism issues
Manual SQL Tuning Required
–Most current ETL has overridden SQL that is heavily tuned to achieve the best performance
–Ad hoc reporting doesn’t allow for manual tuning, so query plans are often sub-optimal to the point of being non-functional
–When the optimizer chooses poor join plans, queries often fail because they run out of temp space
Load Complexities with Partition Exchange Loading
–Large, complex database procedures have been written to create, prep, index, gather stats on, and exchange in partitions
–Partition exchange procedures have proven very challenging for DBAs to support and to enhance to meet new needs
–Partition exchange procedures have to be called from ETL mappings, adding complexity to ETL development and support
Complex Configuration
–The database configuration settings have to be custom configured by our people for our multi-vendor hardware stack
–Configuration has proven to be very complex
–Oracle RAC adds further complexity: disk cluster management software must be configured to support its shared-everything nature
–Interconnect traffic and speed to support the shared cache is always a concern with RAC
–RAC has shown a high number of bugs over the years at our company
End Results
–Cannot run complex business queries and ETL fast by utilizing all hardware
–New summarized marts or tables are needed to answer new questions, adding overall complexity and greatly slowing business responsiveness
–Cannot support large numbers of concurrent queries when tuned to handle large queries (current state)
–Cannot effectively support complex queries created by MicroStrategy
–Very high complexity in support and development
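
The "few wide or many narrow, but not both" behavior described under Parallelism Limitations can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not Oracle's actual allocation algorithm: the pool size, the query names, and the all-or-nothing rule (a query that cannot get its full requested degree of parallelism runs serially) are hypothetical and chosen only to mirror the behavior the slide describes.

```python
from dataclasses import dataclass


@dataclass
class Query:
    name: str
    requested_dop: int  # degree of parallelism the query asks for


def allocate(queries, pool_size):
    """Toy model of a fixed, shared pool of parallel processes.

    A query gets its full requested DOP if the pool still has enough
    free processes; otherwise it falls back to serial execution (DOP 1).
    Serial queries take nothing from the pool.
    """
    free = pool_size
    granted = {}
    for q in queries:
        if q.requested_dop <= free:
            granted[q.name] = q.requested_dop
            free -= q.requested_dop
        else:
            granted[q.name] = 1  # serial fallback
    return granted


POOL_SIZE = 16  # hypothetical number of parallel processes
workload = [
    Query("etl_load", 12),  # one wide job takes most of the pool
    Query("report_a", 8),   # arrives while the pool is nearly drained
    Query("report_b", 4),   # narrow enough to fit in what is left
]
result = allocate(workload, POOL_SIZE)
print(result)
```

In this model the outcome for report_a depends entirely on what happened to be running when it started, which is the same effect the slide describes as inconsistent report performance and as the reason ETL schedules carry artificial dependencies.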

External Findings
The MPP shared-nothing architecture of Teradata and Netezza is best practice
–Supports extreme parallelism for extreme performance with monster queries and ETL
–Performs by applying lots of inexpensive hardware to the problem: brute force
–Each process and processor works completely independently and autonomously on its own data set
–Everything runs fully parallel across all processors/disks
–Workload is managed through queue management, not by spawning more processes that contend for the same data
Most very happy MicroStrategy customers are running against Teradata or Netezza
Appliances eliminate complex custom configurations, reduce TCO, and give customers “One Throat to Choke”
Database software built for data warehousing implicitly handles tasks such as storage/data layout, partitioning, parallelism, and high-efficiency loading
A cost-based optimizer built from the ground up is best for query plans in BI
–Oracle evolved from a rule-based optimizer
–Netezza and Teradata have been cost-based from the beginning
TPC-H warehousing benchmarks test a workload made up of a high number of small queries, so high marks go to conventional OLTP databases like Oracle; that workload is not representative of what we need
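
The shared-nothing idea above (each processor working independently on its own data set) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how Teradata or Netezza is implemented: the node count, the modulo distribution on a hypothetical customer_id key, and the sample rows are all invented for the example, and the nodes are simulated one after another rather than run concurrently.

```python
from collections import defaultdict

NODE_COUNT = 4  # hypothetical number of independent nodes


def distribute(rows, node_count):
    """Assign each (customer_id, amount) row to exactly one node by key."""
    nodes = [[] for _ in range(node_count)]
    for customer_id, amount in rows:
        nodes[customer_id % node_count].append((customer_id, amount))
    return nodes


def local_totals(node_rows):
    """Aggregate using only the rows this node owns; no shared state."""
    totals = defaultdict(int)
    for customer_id, amount in node_rows:
        totals[customer_id] += amount
    return dict(totals)


def merge(partials):
    """Combine per-node results.

    Every key lives on exactly one node, so the partial results never
    overlap and merging is a plain union.
    """
    combined = {}
    for partial in partials:
        combined.update(partial)
    return combined


rows = [(1, 10), (2, 5), (1, 7), (6, 3), (3, 8), (2, 1)]
nodes = distribute(rows, NODE_COUNT)
totals = merge(local_totals(node_rows) for node_rows in nodes)
print(totals)
```

Because each node touches only its own slice, adding nodes adds capacity without adding contention for the same data, which is the property the slide contrasts with a shared-everything design.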

Benefits to application development
More efficient and effective data modeling
–Less pre-aggregation and restructuring of data is needed to meet reporting performance requirements
–Data models can focus on meeting business needs instead of being performance focused
More efficient ETL development
–High load and transformation performance means fewer issues around load windows and the ability to load history
–No custom-built partition management and exchange logic
–No extra partition load logic in ETL
–Minimized SQL tuning
More flexible and effective reporting
–Extreme improvement in conventional reporting performance
–Ability to support complex analysis in MicroStrategy with good response time
–Ability to do analysis across very large sets of data
High-performance querying against atomic data
–Complete and timely data analysis for projects
–Fast prototyping of marts and reporting
–One-off questions can be answered by large queries, not large BI projects