Bringing Value of Big Data to Business: SAP's Integrated Strategy [1] Group 6 - Ziqi Fan, Sheng Chen.

SAP’s Integrated Big Data Strategy
SAP is attempting to create an integrated approach that lets companies do all of the following in one environment:
– Perform analytics
– Make big data operational
– Support applications for high-resolution management

Architecture Vision of SAP’s Integrated Big Data

SAP HANA [2]
SAP HANA, an in-memory database, is the key to SAP’s integrated strategy. HANA DB takes advantage of the low cost of main memory (RAM), the data processing abilities of multi-core processors, and the fast data access of solid-state drives relative to traditional hard drives to deliver better performance for analytical and transactional applications.

SAP HANA [2]
It offers a multi-engine query processing environment that supports relational data as well as graph and text processing for semi-structured and unstructured data management within the same system. HANA DB is 100% ACID compliant.

Main-Memory DB Query Optimization [3]
Logical Optimization
– Almost the same as in a conventional database
Physical Optimization
– Goal: minimize execution costs with respect to a given cost model
– Quite different from a conventional database, because I/O is no longer the dominant cost factor
A “simple” cost model: $T = T_{Mem} + T_{CPU}$

Main-Memory DB Query Optimization
CPU Cost
$T_{CPU} = c_0 + c_1 \cdot n + c_2 \cdot m$
– $c_0$: fixed startup costs
– $c_1$: per-tuple costs for processing input tuples
– $c_2$: per-tuple costs for producing output tuples
– $n$: number of input tuples
– $m$: number of output tuples
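The CPU cost formula on this slide can be evaluated directly. The sketch below is only an illustration: the function name `cpu_cost` and the constant values (imagined here as nanoseconds) are assumptions made for the example, not figures from the presentation or from [3].

```python
def cpu_cost(n_in: int, m_out: int, c0: float, c1: float, c2: float) -> float:
    """Evaluate T_CPU = c0 + c1 * n + c2 * m.

    n_in  -- number of input tuples (n)
    m_out -- number of output tuples (m)
    c0    -- fixed startup cost
    c1    -- cost per input tuple processed
    c2    -- cost per output tuple produced
    """
    return c0 + c1 * n_in + c2 * m_out


if __name__ == "__main__":
    # Hypothetical constants, chosen only to show how the terms combine.
    cost = cpu_cost(n_in=1_000_000, m_out=10_000, c0=500.0, c1=2.0, c2=5.0)
    print(cost)
```

With a selective operator (m much smaller than n), the per-input term dominates, which is why c 1 usually matters most when calibrating such a model.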

Main-Memory DB Query Optimization
Memory Access Cost
– $M_i^s$: number of cache misses at level $i$ for sequential access
– $M_i^r$: number of cache misses at level $i$ for random access
– $l_i^s$: cache latency of level $i$ for sequential access
– $l_i^r$: cache latency of level $i$ for random access
Estimating $M_i^s$ and $M_i^r$ is very difficult!
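The slide's formula for the memory access cost did not survive in this transcript. Given the four quantities defined above, a natural reading is that each level contributes its miss counts weighted by the matching latencies, summed over all levels: T_Mem = sum over i of (M_i^s * l_i^s + M_i^r * l_i^r). The sketch below assumes that form; the class name `LevelCost`, the function name `memory_cost`, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class LevelCost:
    """Miss counts and latencies for one cache level i."""
    seq_misses: float    # M_i^s
    rand_misses: float   # M_i^r
    seq_latency: float   # l_i^s
    rand_latency: float  # l_i^r


def memory_cost(levels: Iterable[LevelCost]) -> float:
    """Assumed form: T_Mem = sum_i (M_i^s * l_i^s + M_i^r * l_i^r)."""
    return sum(
        lv.seq_misses * lv.seq_latency + lv.rand_misses * lv.rand_latency
        for lv in levels
    )


if __name__ == "__main__":
    # Two hypothetical levels; the numbers are illustrative only.
    levels = [
        LevelCost(seq_misses=100, rand_misses=10, seq_latency=5.0, rand_latency=15.0),
        LevelCost(seq_misses=20, rand_misses=5, seq_latency=50.0, rand_latency=100.0),
    ]
    print(memory_cost(levels))
```

The latencies are hardware constants that can be measured once, so the hard part of the model is supplying the miss counts.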

Main-Memory DB Query Optimization
Basic Access Pattern
– Single sequential traversal
– Repetitive sequential traversal
– Single random traversal
– Random access
– etc.
Compound Access Pattern
– Nested-loop join
– Hash join
– etc.
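To make the idea of per-pattern miss estimates concrete, here is a deliberately simplified sketch for the two sequential patterns. It is not the model from [3]: it assumes densely packed items, a single cache level, and an LRU-like cache that retains nothing useful between passes once the region exceeds its capacity. All function and parameter names are hypothetical.

```python
def seq_traversal_misses(n_items: int, item_width: int, line_size: int) -> int:
    """Single sequential traversal: each touched cache line is loaded once.

    Assumes items are stored contiguously, so the region spans
    ceil(n_items * item_width / line_size) cache lines.
    """
    region_bytes = n_items * item_width
    return -(-region_bytes // line_size)  # integer ceiling division


def repeated_seq_traversal_misses(
    n_items: int, item_width: int, line_size: int, cache_size: int, repeats: int
) -> int:
    """Repetitive sequential traversal under the simplifying assumptions above.

    If the region fits in the cache, only the first pass misses.
    Otherwise every pass is assumed to reload every line.
    """
    first_pass = seq_traversal_misses(n_items, item_width, line_size)
    if n_items * item_width <= cache_size:
        return first_pass
    return first_pass * repeats


if __name__ == "__main__":
    # 1000 items of 8 bytes with 64-byte lines span 125 cache lines.
    print(seq_traversal_misses(1000, 8, 64))
    print(repeated_seq_traversal_misses(1000, 8, 64, cache_size=4096, repeats=3))
```

A compound pattern such as a nested-loop join can then be viewed as one sequential traversal of the outer input combined with a repetitive sequential traversal of the inner input.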

References
[1] Dan Woods, “Bringing Value of Big Data to Business: SAP's Integrated Strategy”, Forbes, 01/05/ value-of-big-data-to-business-saps-integrated-strategy/
[2]
[3] Manegold S.: Understanding, Modeling, and Improving Main-Memory Database Performance, SIKS Dissertation Series No , ISBN , pp