Analyze/Report from large Volumes of Data

Presentation transcript:

WebFOCUS Hyperstage: Analyze/Report from Large Volumes of Data
Information Builders (Canada) Inc.
May 11, 2012

WebFOCUS: Higher Adoption & Reuse with Lower TCO
[Platform diagram labels: Mobile Applications; Data Updating; Visualization & Mapping; Predictive Analytics; Enterprise Search; High Performance Data Store; Extended BI; Performance Management; MS Office & e-Publishing; Query & Analysis; Dashboards; Reporting; Information Delivery; Core BI; Data Warehouse & ETL; Business to Business; Data Profiling & Data Quality; Business Activity Monitoring; Master Data Management]
Extensions to the WebFOCUS platform allow you to build more application types at a lower cost.

WebFOCUS High Performance Data Store

The Business Challenge: Big Data

Today’s Top Data-Management Challenge: Big Data and Machine-Generated Data
[Chart: storage over time, machine-generated data versus human-generated data]

IT managers try to mitigate these response times
[Chart: How Performance Issues Are Typically Addressed, by Pace of Data Growth]
When organizations have long-running queries that limit the business, the response is often to spend much more time and money to resolve the problem.
Source: Joseph McKendrick, "Keeping Up with Ever-Expanding Enterprise Data", Unisphere Research, October 2010.

Classic Approaches and Challenges: Data Warehousing
- More data, more data sources
- Limited resources and budget
- More kinds of output needed by more users, more quickly
[Diagram labels: Real time data; Multiple databases; External Sources]
Traditional data warehousing:
- Labour intensive: heavy indexing, aggregations and partitioning
- Hardware intensive: massive storage; big servers
- Expensive and complex

Classic Approaches and Challenges: Data Warehousing, Growing Demands
New demands:
- Larger transaction volumes driven by the internet
- Impact of cloud computing
- More -> Faster -> Cheaper
Data warehousing matures:
- Near real-time updates
- Integration with master data management
- Data mining using discrete business transactions
- Provision of data for business-critical applications
Early data warehouse characteristics:
- Integration of internal systems
- Monthly and weekly loads
- Heavy use of aggregates

Classic Approaches and Challenges: Dealing with Large Data
- Indexes
- Cubes/OLAP

Classic Approaches and Challenges: Limitations of Indexes
- Increased space requirements
  - Sum of index space requirements can exceed the source DB
- Index management
  - Increases load times
  - Building the index
- Predefines a fixed access path

Classic Approaches and Challenges: Limitations of OLAP
- Cube technology has limited scalability
  - Number of dimensions is limited
  - Amount of data is limited
- Cube technology is difficult to update (add dimension)
  - Usually requires a complete rebuild
  - Cube builds are typically slow
  - New design results in a new cube

Limitations of Rows: These Solutions Contribute to Operational Limitations
- Impediments to business agility: Users must wait for DBAs to create indexes or other tuning structures, which delays access to data. Indexes significantly slow data-loading operations and increase the size of the database, sometimes by a factor of 2x.
- Loss of data and time fidelity: ETL operations are typically performed in batch during non-business hours. This delays access to data and often results in mismatches between operational and analytic databases.
- Limited ad hoc capability: Response times for ad hoc queries increase as the volume of data grows. Unanticipated queries (where DBAs have not tuned the database in advance) can result in unacceptable response times.
- Unnecessary expenditures: Attempts to improve performance using hardware acceleration and database tuning schemes raise the capital costs of equipment and the operational costs of database administration. The added complexity of managing a large database diverts operational budgets away from more urgent IT projects.

Many IT organizations and technology solution providers rely on traditional relational databases for their data warehouse, data mart or analytic repository. The problem is that those databases were designed for transactional applications, not analytics against large data volumes. As a result, many companies find that as the volume of data grows, those systems cannot meet users' performance requirements. In addition, traditional database technology requires a high degree of effort (such as creating and maintaining indexes, creating cubes or projections, or partitioning data) and is costly to license and maintain.

Let’s discuss row-based approaches in more detail.

Pivoting Your Perspective: Columnar Technology ….

The Limitation of Rows: The Ubiquity of Rows
[Diagram: a table of 30 columns by 50 million rows]

Row-based databases are ubiquitous because so many of our most important business systems are transactional. Row-oriented databases are well suited for transactional environments, such as a call center where a customer’s entire record is required when their profile is retrieved and/or when fields are frequently updated.

Row-based databases run into trouble when they are used to handle analytic loads against large volumes of data, especially when user queries are dynamic and ad hoc. To see why, let’s look at a database of sales transactions with 50 days of data and 1 million rows per day. Each row has 30 columns of data, so this database has 30 columns and 50 million rows. Say you want to see how many toasters were sold in the third week of this period. A row-based database would return 7 million rows (1 million for each day of the third week) with 30 columns for each row, or 210 million data elements. That’s a lot of data elements to crunch to find out how many toasters were sold that week.

As the data set increases in size, disk I/O becomes a substantial limiting factor, since a row-oriented design forces the database to retrieve all column data for any query. As mentioned above, many companies try to solve this I/O problem by creating indices to optimize queries. This may work for routine reports (e.g. you always want to know how many toasters you sold in the third week of a reporting period), but there is a point of diminishing returns: load speed degrades because indices need to be recreated as data is added.
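The toaster arithmetic can be checked in a few lines of Python. This is only a sketch: the row and column counts come from the slide, but the assumption that the query needs just three columns (say, sale date, product and quantity) is ours, added to show what a store that reads only the relevant columns would touch.

```python
# Figures taken from the slide: 1 million rows per day, 30 columns per row.
ROWS_PER_DAY = 1_000_000
COLUMNS_PER_ROW = 30
DAYS_IN_WEEK = 7

# Assumption (not from the slides): the toaster query needs only three
# columns, e.g. sale date, product and quantity.
COLUMNS_NEEDED = 3

rows_scanned = ROWS_PER_DAY * DAYS_IN_WEEK

# A row-oriented store retrieves every column of every scanned row.
row_store_elements = rows_scanned * COLUMNS_PER_ROW

# A store that reads only the needed columns touches far fewer elements.
column_store_elements = rows_scanned * COLUMNS_NEEDED

print(f"rows scanned:          {rows_scanned:,}")
print(f"row store elements:    {row_store_elements:,}")
print(f"column store elements: {column_store_elements:,}")
```

Under that three-column assumption, the work drops by a factor of ten; the exact ratio depends on how many columns a given query actually needs.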

Pivoting Your Perspective: Columnar Technology

Employee Id   Name     Location   Sales
1             Smith    New York   50,000
2             Jones    New York   65,000
3             Fraser   Boston     40,000
4             Fraser   Boston     70,000

Row Oriented
(1, Smith, New York, 50000; 2, Jones, New York, 65000; 3, Fraser, Boston, 40000; 4, Fraser, Boston, 70000)
- Works well if all the columns are needed for every query
- Efficient for transactional processing if all the data for the row is available

Column Oriented
(1, 2, 3, 4; Smith, Jones, Fraser, Fraser; New York, New York, Boston, Boston; 50000, 65000, 40000, 70000)
- Works well with aggregate results (sum, count, avg.)
- Only columns that are relevant need to be touched
- Consistent performance with any database design
- Allows for very efficient compression
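The two layouts can be sketched in Python using the four employee rows from the slide. This is an in-memory illustration of the idea only, not how Hyperstage or any particular database stores data; the variable names are ours.

```python
# The four-row employee table from the slide, stored row by row.
rows = [
    (1, "Smith", "New York", 50000),
    (2, "Jones", "New York", 65000),
    (3, "Fraser", "Boston", 40000),
    (4, "Fraser", "Boston", 70000),
]

# Column-oriented layout: one list per column, built by transposing the rows.
employee_id, name, location, sales = (list(column) for column in zip(*rows))

# Row layout: every row tuple is visited to total a single field.
total_from_rows = sum(row[3] for row in rows)

# Column layout: only the sales column is touched.
total_from_columns = sum(sales)

print(location)            # repeated values sit next to each other
print(total_from_columns)
```

Note how the column layout places repeated values such as the location side by side, which is what makes the efficient compression mentioned on the slide possible.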

WebFOCUS Hyperstage

Introducing WebFOCUS Hyperstage
Mission:
- Improve database performance for WebFOCUS applications with less hardware, no database tuning, and easy migration
What is WebFOCUS Hyperstage?
- High-performance analytic data store
- Designed to handle business-driven queries on large volumes of data without IT intervention
- Easy to implement and manage, Hyperstage provides the answers your business users need at a price you can afford
Advantages:
- Dramatically increases performance of WebFOCUS applications
- Disk footprint reduced with a powerful compression algorithm, which means faster response times
- Embedded ETL for seamless migration of existing analytical databases
- No change in query or application required
- Includes optimized Hyperstage Adapter
- WebFOCUS metadata can be used to define hierarchies and drill paths to navigate the star schema

Introducing WebFOCUS Hyperstage: How It Is Architected
[Diagram labels: Hyperstage Engine; Knowledge Grid; Compressor; Bulk Loader]
Combines a columnar database with intelligence we call the Knowledge Grid to deliver fast query responses.
Unmatched administrative simplicity:
- No indexes
- No data partitioning
- No manual tuning
Improve database performance for WebFOCUS applications with less hardware, no database tuning, and easy migration.

Introducing WebFOCUS Hyperstage: What It Means for Customers
- Self-managing: 90% less administrative effort
- Low-cost: More than 50% less than alternative solutions
- Scalable, high-performance: Up to 50 TB using a single industry-standard server
- Fast queries: Ad hoc queries are as fast as anticipated queries, so users have total flexibility
- Compression: Data compression of 10:1 to 40:1 means a lot less storage is needed. It might even mean you can get the entire database in memory!

Introducing WebFOCUS Hyperstage: How It Works
- Creates information (metadata) about the data upon load and automatically stores it in the Knowledge Grid (KG)
  - The KG is loaded into memory
  - The KG is less than 1% of the compressed data size
- Uses the metadata when processing a query to eliminate or reduce the need to access data
  - The less data that needs to be accessed, the faster the response
  - Sub-second responses when answered by the KG
Architecture benefits:
- No need to partition data, create or maintain indexes or projections, or tune for performance
- Ad hoc queries are as fast as static queries, so users have total flexibility
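One way metadata can eliminate or reduce data access is sketched below in Python. The slides do not say what the Knowledge Grid actually stores, so everything here is an assumption for illustration: a hypothetical `PackStats` record holding the minimum, maximum and sum of each block of values, a tiny block size of four, and a made-up `sales` column. The query sums all values above a threshold and opens a block only when its metadata cannot settle the answer.

```python
from dataclasses import dataclass


@dataclass
class PackStats:
    """Hypothetical per-block metadata (an assumption, not the real format)."""
    minimum: int
    maximum: int
    total: int


PACK_SIZE = 4  # illustrative block size, far smaller than a real one


def build_packs(values, pack_size=PACK_SIZE):
    """Split a column into blocks and record statistics for each at load time."""
    packs = [values[i:i + pack_size] for i in range(0, len(values), pack_size)]
    stats = [PackStats(min(p), max(p), sum(p)) for p in packs]
    return packs, stats


def sum_greater_than(packs, stats, threshold):
    """Sum values above threshold, reading a block only if metadata is undecided."""
    total = 0
    packs_opened = 0
    for pack, stat in zip(packs, stats):
        if stat.maximum <= threshold:
            continue  # no value can qualify: skip the block entirely
        if stat.minimum > threshold:
            total += stat.total  # every value qualifies: use the stored sum
            continue
        packs_opened += 1  # mixed block: the data itself must be read
        total += sum(v for v in pack if v > threshold)
    return total, packs_opened


sales = [10, 12, 11, 13, 48, 52, 50, 55, 90, 95, 92, 99]  # made-up values
packs, stats = build_packs(sales)
result, opened = sum_greater_than(packs, stats, 50)
print(result, opened)
```

In this sketch only one of the three blocks has to be read; the other two are answered from metadata alone, which is the effect the slide describes as "the less data that needs to be accessed, the faster the response".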

WebFOCUS Hyperstage Engine: How It Works
- Column orientation
- Smarter architecture
  - No maintenance
  - No query planning
  - No partition schemes
  - No DBA
- Knowledge Grid: statistics and metadata “describing” the super-compressed data
- Data Packs: data stored in manageably sized, highly compressed data packs
- Data compressed using algorithms tailored to data type
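The slides do not name the compression algorithms, so the following Python sketch uses two well-known stand-ins to show what "tailored to data type" can mean: run-length encoding for a text column with repeated values, and delta encoding for a steadily increasing integer column. The `order_ids` column is a made-up example, and neither function is claimed to be what Hyperstage uses.

```python
from itertools import accumulate, groupby


def rle_encode(values):
    """Run-length encode: collapse each run of equal values into (value, count)."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


def delta_encode(numbers):
    """Keep the first number, then store each difference from its predecessor."""
    if not numbers:
        return []
    return [numbers[0]] + [b - a for a, b in zip(numbers, numbers[1:])]


def delta_decode(deltas):
    """Rebuild the original numbers as a running sum of the deltas."""
    return list(accumulate(deltas))


# Text column from the employee table: runs of repeated values suit RLE.
location = ["New York", "New York", "Boston", "Boston"]
# Made-up integer column: small steps between neighbours suit delta encoding.
order_ids = [1000, 1001, 1002, 1003]

encoded_location = rle_encode(location)
encoded_ids = delta_encode(order_ids)
print(encoded_location)
print(encoded_ids)
```

Both encodings are lossless, and both rely on similar values being stored next to each other, which a column layout provides and a row layout does not.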

Summary

Business Intelligence – Meeting Requirements

WebFOCUS Hyperstage: The Big Deal
- No indexes
- No partitions
- No views
- No materialized aggregates
Value proposition:
- Low IT overhead
- Allows for autonomy from IT
- Ease of implementation
- Fast time to market
- Less hardware
- Lower TCO
- No DBA required!

WebFOCUS Hyperstage Adapter What it looks like


Example – Focus to Hyperstage Compression 243639 Rows

Q&A