Peter Azzarello April 11, 2012 IB Toronto User Forum WebFOCUS Hyperstage Overview Summit 2012.


WebFOCUS: Higher Adoption & Reuse with Lower TCO
Core BI and Extended BI: Reporting; Query & Analysis; Dashboards; Information Delivery; Performance Management; Enterprise Search; Visualization & Mapping; Data Updating; Predictive Analytics; MS Office & e-Publishing
Extensions to the WebFOCUS platform allow you to build more application types at a lower cost: Business to Business; Data Warehouse & ETL; Master Data Management; Data Profiling & Data Quality; Business Activity Monitoring; High Performance Data Store; Mobile Applications

Copyright 2007, Information Builders. Slide 3 The Business Challenge Big Data

Big Data and Machine Generated Data
Today’s Top Data-Management Challenge
[Chart: data storage over time, comparing machine-generated data with human-generated data]

How Performance Issues are Typically Addressed, by Pace of Data Growth
When organizations have long-running queries that limit the business, the response is often to spend much more time and money to resolve the problem. IT managers try to mitigate these response times …
Source: KEEPING UP WITH EVER-EXPANDING ENTERPRISE DATA (Joseph McKendrick, Unisphere Research, October 2010)

Data Warehousing Challenges
Traditional Data Warehousing
• Labor intensive: heavy indexing, aggregations and partitioning
• Hardware intensive: massive storage; big servers
• Expensive and complex
Pressures on the warehouse:
• More data, more data sources (real-time data, multiple databases, external sources)
• More kinds of output needed by more users, more quickly
• Limited resources and budget

Data Warehousing Challenges
New Demands:
• Larger transaction volumes driven by the internet
• Impact of cloud computing
• More -> Faster -> Cheaper
Data Warehousing Matures:
• Near real time updates
• Integration with master data management
• Data mining using discrete business transactions
• Provision of data for business critical applications
Early Data Warehouse Characteristics:
• Integration of internal systems
• Monthly and weekly loads
• Heavy use of aggregates

Classic Approaches to Deal with Large Data
• Indexes
• Cubes/OLAP

Limitations of Indexes
• Increased space requirements
• Sum of index space requirements can exceed the source DB
• Index management
• Increases load times
• Building the index
• Predefines a fixed access path

Limitations of OLAP
• Cube technology has limited scalability
• Number of dimensions is limited
• Amount of data is limited
• Cube technology is difficult to update (add dimension)
• Usually requires a complete rebuild
• Cube builds are typically slow
• New design results in a new cube

Easy Migration to Hyperstage
• Most cubes will be fed from a relational source
• It is common for that relational source to be a star schema
• The source star schema can be migrated directly to Hyperstage
• WebFOCUS metadata can be used to define hierarchies and drill paths to navigate the star schema

Pivoting Your Perspective: Columnar Technology …

The Limitation of Rows
These Solutions Contribute to Operational Limitations
1. Impediments to business agility: Organizations often must wait for DBAs to create indexes or other tuning structures, thereby delaying access to data. In addition, indexes significantly slow data-loading operations and increase the size of the database, sometimes by a factor of 2x.
2. Loss of data and time fidelity: IT generally performs ETL operations in batch mode during non-business hours. Such transformations delay access to data and often result in mismatches between operational and analytic databases.
3. Limited ad hoc capability: Response times for ad hoc queries increase as the volume of data grows. Unanticipated queries (where DBAs have not tuned the database in advance) can result in unacceptable response times, and may even fail to complete.
4. Unnecessary expenditures: Attempts to improve performance using hardware acceleration and database tuning schemes raise the capital costs of equipment and the operational costs of database administration. Further, the added complexity of managing a large database diverts operational budgets away from more urgent IT projects.

The Limitation of Rows
The Ubiquity of Rows …
Row-based databases are ubiquitous because so many of our most important business systems are transactional. Row-oriented databases are well suited for transactional environments, such as a call center where a customer’s entire record is required when their profile is retrieved and/or when fields are frequently updated.
But: disk I/O becomes a substantial limiting factor, since a row-oriented design forces the database to retrieve all column data for any query.
Example table: 30 columns, 50 million rows
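To make the disk I/O point concrete, here is a rough back-of-the-envelope sketch for the 30-column, 50-million-row table on the slide. The 8-byte value width and the 3-column query are assumptions chosen only for illustration; they do not come from the presentation, and compression is ignored.

```python
# Rough I/O estimate for the table on the slide (30 columns, 50 million rows).
ROWS = 50_000_000        # from the slide
COLUMNS = 30             # from the slide
BYTES_PER_VALUE = 8      # assumption: fixed-width values, no compression
COLUMNS_NEEDED = 3       # assumption: the query touches only 3 columns


def bytes_scanned(rows, columns_read, bytes_per_value=BYTES_PER_VALUE):
    """Bytes read when `columns_read` columns are fetched for every row."""
    return rows * columns_read * bytes_per_value


# A row store reads every column of every row, whatever the query asks for.
row_store_bytes = bytes_scanned(ROWS, COLUMNS)
# A column store reads only the columns the query references.
column_store_bytes = bytes_scanned(ROWS, COLUMNS_NEEDED)

print(f"row store:    {row_store_bytes / 1e9:.1f} GB")
print(f"column store: {column_store_bytes / 1e9:.1f} GB")
```

Under these assumptions the row-oriented scan reads ten times as much data as the column-oriented one, simply because 30 / 3 = 10.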

Pivoting Your Perspective: Columnar Technology
Employee Id | Name   | Location | Sales
1           | Smith  | New York | 50,000
2           | Jones  | New York | 65,000
3           | Fraser | Boston   | 40,000
4           | Fraser | Boston   | 70,000
Row Oriented (1, Smith, New York, 50000; 2, Jones, New York, 65000; 3, Fraser, Boston, 40000; 4, Fraser, Boston, 70000)
• Works well if all the columns are needed for every query
• Efficient for transactional processing if all the data for the row is available
Column Oriented (1, 2, 3, 4; Smith, Jones, Fraser, Fraser; New York, New York, Boston, Boston; 50000, 65000, 40000, 70000)
• Works well with aggregate results (sum, count, avg.)
• Only columns that are relevant need to be touched
• Consistent performance with any database design
• Allows for very efficient compression

Pivoting Your Perspective: Columnar Technology
Data stored in rows: 1 Smith New York 50,000 | 2 Jones New York 65,000 | 3 Fraser Boston 40,000 | 4 Fraser Boston 70,000
Data stored in columns: Employee Id 1, 2, 3, 4 | Name Smith, Jones, Fraser, Fraser | Location New York, New York, Boston, Boston | Sales 50,000, 65,000, 40,000, 70,000
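The two layouts can be sketched in a few lines of Python using the four employee records from the slide. The field names and the `to_columns` helper are illustrative only; this is not how Hyperstage stores data internally.

```python
# The four employee records from the slide, stored row by row.
rows = [
    (1, "Smith", "New York", 50000),
    (2, "Jones", "New York", 65000),
    (3, "Fraser", "Boston", 40000),
    (4, "Fraser", "Boston", 70000),
]
FIELDS = ("employee_id", "name", "location", "sales")  # illustrative names


def to_columns(rows, fields):
    """Pivot a list of row tuples into one list per column."""
    return {field: [row[i] for row in rows] for i, field in enumerate(fields)}


columns = to_columns(rows, FIELDS)

# Row layout: the aggregate walks every complete record to pick out one field.
total_from_rows = sum(row[3] for row in rows)
# Column layout: the aggregate reads the single "sales" list and nothing else.
total_from_columns = sum(columns["sales"])

print(columns["location"])
print(total_from_rows, total_from_columns)
```

Both totals agree; the difference is how much unrelated data has to be visited to compute them.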

Introducing WebFOCUS Hyperstage

The Hyperstage Mission Improve database performance for WebFOCUS applications with less hardware, no database tuning and easy migration.

Introducing WebFOCUS Hyperstage … What is it?
The WebFOCUS Hyperstage high-performance analytic data store is designed to handle business-driven queries on large volumes of data, without IT intervention. Easy to implement and manage, Hyperstage provides the answers your business users need at a price you can afford.

Introducing WebFOCUS Hyperstage … How is it architected?
Hyperstage combines a columnar database with intelligence we call the Knowledge Grid to deliver fast query responses.
Hyperstage Engine components: Knowledge Grid, Compressor, Bulk Loader
Unmatched Administrative Simplicity
• No indexes
• No data partitioning
• No manual tuning

Introducing WebFOCUS Hyperstage … What does this mean for Customers?
• Self-managing: 90% less administrative effort
• Low-cost: More than 50% less than alternative solutions
• Scalable, high-performance: Up to 50 TB using a single industry-standard server
• Fast queries: Ad hoc queries are as fast as anticipated queries, so users have total flexibility
• Compression: Data compression of 10:1 to 40:1 means far less storage is needed, and may even allow the entire database to fit in memory

Introducing WebFOCUS Hyperstage … How does it work?
Creates information (metadata) about the data and, upon load, automatically:
o Stores it in the Knowledge Grid (KG)
o KG is loaded into memory
o Less than 1% of compressed data size
Uses the metadata when processing a query to eliminate or reduce the need to access data:
o The less data that needs to be accessed, the faster the response
o Sub-second responses when answered by KG
Architecture Benefits:
o No need to partition data, create or maintain indexes or projections, or tune for performance
o Ad hoc queries are as fast as static queries, so users have total flexibility
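A quick worked estimate shows why metadata of this size can plausibly be held in memory. The 1 TB raw data volume is an assumption made up for illustration; the 10:1 and 40:1 compression ratios are the range quoted elsewhere in the deck, and the 1% figure is treated here as an upper bound.

```python
RAW_TB = 1.0          # assumption: 1 TB of raw data, chosen only for illustration
KG_FRACTION = 0.01    # from the slide: KG is less than 1% of compressed size


def knowledge_grid_size_gb(raw_tb, compression_ratio, kg_fraction=KG_FRACTION):
    """Upper-bound Knowledge Grid size in GB for a given compression ratio."""
    compressed_gb = raw_tb * 1000 / compression_ratio
    return compressed_gb * kg_fraction


for ratio in (10, 40):  # compression range quoted in the deck
    size = knowledge_grid_size_gb(RAW_TB, ratio)
    print(f"{ratio}:1 compression -> Knowledge Grid of at most about {size:.2f} GB")
```

With these inputs the metadata comes to roughly 1 GB or less, which is small next to the data it describes.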

WebFOCUS Hyperstage Runtime Architecture
[Diagram: Hypercopy feeds the Hyperstage Server (Hyperstage Engine with Knowledge Grid, Compressor and Bulk Loader; MySQL); the WebFOCUS Server / WebFOCUS Pro Server connects through the Hyperstage Adapter]

Smarter Architecture  No maintenance  No query planning  No partition schemes  No DBA Data Packs – data stored in manageably sized, highly compressed data packs Knowledge Grid – statistics and metadata “describing” the super-compressed data Column Orientation WebFOCUS Hyperstage Engine Data compressed using algorithms tailored to data type How does it work?

Data Organization and the Knowledge Grid …
Knowledge Grid: consists of knowledge nodes that are generated as the data is loaded and dynamic knowledge nodes that are created during query execution.
• Not manually created: Unlike the indexes required for traditional databases, the Knowledge Grid is generated automatically and does not require ongoing care and maintenance.
• Limited overhead: The Knowledge Grid provides a high-level view of the entire content of the database with a minimal overhead of approximately 1% of the compressed data. (By contrast, classic indexes may represent as much as 200% of the size of the original data.)
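The following is a minimal sketch of how per-pack statistics can let a query skip data. It assumes that a knowledge node holds the minimum, maximum and row count of each data pack; the deck only says the Knowledge Grid holds "statistics and metadata", so that structure, the `PackStats` class and the tiny pack size are all hypothetical.

```python
from dataclasses import dataclass

PACK_SIZE = 4  # assumption: tiny packs so the example is easy to follow


@dataclass
class PackStats:
    """Hypothetical knowledge node: summary statistics for one data pack."""
    minimum: int
    maximum: int
    count: int


def build_packs(values, pack_size=PACK_SIZE):
    """Split a column into packs and record statistics for each, as at load time."""
    packs = [values[i:i + pack_size] for i in range(0, len(values), pack_size)]
    stats = [PackStats(min(p), max(p), len(p)) for p in packs]
    return packs, stats


def count_greater_than(packs, stats, threshold):
    """Count values > threshold, opening a pack only when its stats are inconclusive."""
    total = 0
    packs_opened = 0
    for pack, s in zip(packs, stats):
        if s.maximum <= threshold:
            continue              # no value can match: skip the pack entirely
        if s.minimum > threshold:
            total += s.count      # every value matches: answered from metadata
            continue
        packs_opened += 1         # mixed pack: the data itself must be read
        total += sum(1 for v in pack if v > threshold)
    return total, packs_opened


sales = [10, 20, 30, 40, 45, 60, 70, 80, 90, 95, 99, 100]
packs, stats = build_packs(sales)
result = count_greater_than(packs, stats, 50)
print(result)  # (7, 1): seven matching values, only one of three packs opened
```

Two of the three packs are resolved from the statistics alone, which is the sense in which less data accessed means a faster response.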

Summary

Business Intelligence – Meeting Requirements

WebFOCUS Hyperstage: The Big Deal…
• No indexes
• No partitions
• No views
• No materialized aggregates
Value proposition:
• Low IT overhead
• Allows for autonomy from IT
• Ease of implementation
• Fast time to market
• Less hardware
• Lower TCO
No DBA Required!

What’s it look like?

Pay no attention to that man behind the curtain.
CREATE FILE baseapp/pa_inventory_ind_t DROP
-RUN
BULKLOAD baseapp/pa_inventory_ind_t FOR SQLINLD INV_CODE; TYPE; CATEGORY; NAME; MODEL; MEASURE1_INV; MEASURE2_INV; MEASURE3_INV;
JOIN
 SYMBOLS.SYMBOLS.SYMBOL IN SYMBOLS TO MULTIPLE QUOTES_2B.QUOTES_2B.SYMBOL
 IN QUOTES_2B TAG J0 AS J0
END
TABLE FILE SYMBOLS
PRINT
 SYMBOL CLOSE_DATE CLOSE_PRICE VOLUME OPEN_PRICE
WHERE ( SYMBOL EQ '&SYMBOL.( ).SYMBOL.' ) AND ( CLOSE_DATE GT '&START_DATE.( ).yyyy-mm-dd.' ) AND ( CLOSE_DATE LT '&END_DATE.( ).yyyy-mm-dd.' );
ON TABLE SET PAGE-NUM NOLEAD
ON TABLE NOTOTAL
ON TABLE PCHOLD FORMAT HTML
ON TABLE SET HTMLCSS ON
ON TABLE SET STYLE *
 INCLUDE = endeflt,
$
ENDSTYLE
END

Example – Focus to Hyperstage Compression Rows

Q&A

STAR SCHEMA CONSIDERATIONS

Everyone wants to be a Star
Leverage the Knowledge Grid:
• Do constrain the fact table directly
• Do use sub-selects instead of joins
• Do use date-based constraints as much as possible
• Do add additional columns to create useful knowledge nodes
Adding as many WHERE conditions as you can to your SQL increases the chance that Knowledge Grid statistics can be used to improve the performance of your queries.
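The rewrite these guidelines describe can be sketched with a tiny star schema. SQLite is used here purely as a stand-in so the example runs anywhere; the table and column names (`dim_store`, `fact_sales` and so on) are hypothetical. The example shows only that the sub-select form returns the same answer while placing every constraint directly on the fact table. It says nothing about how Hyperstage would execute either query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE fact_sales (store_id INTEGER, sale_date TEXT, amount INTEGER);
    INSERT INTO dim_store VALUES (1, 'East'), (2, 'West');
    INSERT INTO fact_sales VALUES
        (1, '2012-01-15', 100),
        (1, '2012-03-02', 250),
        (2, '2012-03-10', 400),
        (1, '2011-12-30', 75);
""")

# Join form: the region filter is applied to the dimension table.
join_sql = """
    SELECT SUM(f.amount)
    FROM fact_sales f
    JOIN dim_store d ON d.store_id = f.store_id
    WHERE d.region = 'East'
      AND f.sale_date >= '2012-01-01'
      AND f.sale_date <  '2013-01-01'
"""

# Sub-select form: every WHERE condition constrains a fact table column,
# including a date range, as the slide recommends.
subselect_sql = """
    SELECT SUM(amount)
    FROM fact_sales
    WHERE store_id IN (SELECT store_id FROM dim_store WHERE region = 'East')
      AND sale_date >= '2012-01-01'
      AND sale_date <  '2013-01-01'
"""

join_total = conn.execute(join_sql).fetchone()[0]
subselect_total = conn.execute(subselect_sql).fetchone()[0]
print(join_total, subselect_total)
```

Both queries sum the 2012 sales for the East region, so the rewrite changes the shape of the SQL without changing the result.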