Oracle Parallel Query vs Hadoop MapReduce for Sequential Data Access
Zbigniew Baranowski
CERN IT Department, CH-1211 Geneva 23, Switzerland
Outline Motivation SW & HW architecture Test setup Results Summary 2
Outline Motivation SW & HW architecture Test setup Results Summary 3
Motivation for testing
The relational model does not work well for:
–data analysis on big data volumes
–random data access (and not by primary key)
–indexes consume too much space
–table joins are inefficient
–e.g. ATLAS Events Metadata, ATLAS DQ2 logs
Full data scans (brute force) work best! 4
What do we want to test? Full Scan ? 5
Why Hadoop?
Emerging technology…
–Yahoo, Google (BigTable/GFS)
…that people are interested in
–ATLAS…
Strong for data analytics and data warehousing
–in this field Oracle can be hard to optimize and/or can be very expensive (Exadata)
IT may be about to provide a Hadoop service 6
Why Oracle?
30 years at CERN
–a lot of data already in Oracle
Full ACID transactional system
Optimized for various data access methods 7
Compare apples to apples
Comparable test for Hadoop and Oracle
–simple data structure
–simple data access (sequential)
Oracle stack: Application / Oracle RDBMS / Oracle I/O interface / Hardware
Hadoop stack: Application / Map Reduce / Hadoop I/O interface / Hardware 8
Outline Motivation SW & HW architecture Test setup Results Summary 9
Hardware architecture
Shared storage: Node 1 … Node n reach a common set of storage arrays (Storage 1 … Storage n) over an interconnect
Shared nothing: each node (Node 1 … Node n) owns its local storage; nodes communicate over an interconnect 10
Software architecture Map Reduce 11
Map Reduce Example 12
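The word-count example on this slide can be sketched in plain Python (a didactic stand-in, not the deck's actual Hadoop code) to make the map, shuffle and reduce phases explicit:

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    # Map phase: emit a (word, 1) pair for every word in the input split
    for word in line.split():
        yield (word, 1)

def reducer(word, counts):
    # Reduce phase: sum all counts collected for one key
    return (word, sum(counts))

def map_reduce(lines):
    # Shuffle: sort intermediate pairs so equal keys become adjacent,
    # then group by key and reduce each group
    pairs = sorted(kv for line in lines for kv in mapper(line))
    return [reducer(word, (c for _, c in group))
            for word, group in groupby(pairs, key=itemgetter(0))]

lines = ["Deer Bear River", "Car Car River", "Deer Car Bear"]
print(map_reduce(lines))
# [('Bear', 2), ('Car', 3), ('Deer', 2), ('River', 2)]
```

In a real Hadoop job the shuffle is performed by the framework between the map and reduce tasks; here it is a single in-memory sort.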
Software architecture: Oracle Parallel Query 13
Degree Of Parallelism = 5
Table split into block sets #1–#5
Parallel Executors set #1: reading and filtering
Parallel Executors set #2: grouping and counting
Parallel Coordinator: aggregation, producing the final result
Parallel Query Example 14
Degree Of Parallelism = 3
Input table: Deer Bear River / Car Car River / Deer Car Bear
Splitting into granules
PE set #1 – reading and sub-counting: (Deer,1 Bear,1 River,1) (Car,2 River,1) (Deer,1 Car,1 Bear,1)
PE set #2 – counting and sorting: (Bear,2 Car,3) (Deer,2 River,2)
Coordinator – aggregating: Bear,2 Car,3 Deer,2 River,2
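The same computation under Oracle's Parallel Query model can be simulated in Python. This is only an illustrative sketch: the round-robin granule assignment and hash redistribution below are simplified assumptions, not Oracle internals.

```python
from collections import Counter

def pq_word_count(table_rows, dop=3):
    # PE set #1: each executor scans its granule of rows and sub-counts
    granules = [table_rows[i::dop] for i in range(dop)]
    partial = [Counter(w for row in g for w in row.split()) for g in granules]

    # Redistribution: each distinct word is routed to one PE of set #2
    buckets = [Counter() for _ in range(dop)]
    for counter in partial:
        for word, n in counter.items():
            buckets[hash(word) % dop][word] += n

    # Coordinator: concatenate the disjoint per-PE results and sort
    final = Counter()
    for b in buckets:
        final.update(b)
    return sorted(final.items())

rows = ["Deer Bear River", "Car Car River", "Deer Car Bear"]
print(pq_word_count(rows))
```

Note the structural similarity to MapReduce: PE set #1 plays the mapper/combiner role, the redistribution is the shuffle, and PE set #2 plus the coordinator perform the reduction.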
Outline Motivation SW & HW architecture Test setup Results Summary 15
Hardware used 16
Hardware config for Hadoop Shared nothing implementation 17
Hardware config for Oracle Shared storage implementation 18
Software configuration
Hadoop
–v1.0.3
–Cluster file system: HDFS on top of ext3
–More specific configuration/optimization: Delayed Fair Scheduler, JVM reuse, file block size 1 GB, …
Oracle database
–12c RAC
–Cluster file system: Oracle's ASM
–Configuration recommended by Oracle 19
Data set used Could be generic, but… How do we know if we got the right results? 1TB CSV file with physics data –Numeric values (integers, floats) –532M lines/records –244 columns/attributes per record/line 20
Test case
Counting "exotic" muons in the collection
–Oracle version:

select /*+ parallel(x) */ count(*)
from iotest.muon1tb
where col93 > 100000 and col20 = 1;

–MapReduce version:

//MAP body
LongWritable one = new LongWritable(1);
NullWritable nw = NullWritable.get();
String[] records = line.split(",");
tight = Integer.parseInt(records[19]);  // col20
pt = Float.parseFloat(records[92]);     // col93
if (tight == 1 && pt > 100000) {
    collector.collect(nw, one);
}

//REDUCER body
long sum = 0;
while (values.hasNext()) {
    sum += values.next().get();
}
output.collect(NullWritable.get(), new LongWritable(sum));
21
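The deck later benchmarks a Python MR via Hadoop Streaming; a minimal streaming-style version of the same filter could look like the sketch below. The column indices follow the Java mapper above; everything else (function names, the single-process driver) is an assumption for illustration.

```python
import sys

def map_line(line):
    # Same predicate as the Java mapper: col20 == 1 and col93 > 100000
    # (0-based CSV indices 19 and 92)
    rec = line.rstrip("\n").split(",")
    if int(rec[19]) == 1 and float(rec[92]) > 100000:
        return 1
    return 0

def reduce_counts(counts):
    # Streaming reducer: sum the 1s emitted by the mappers
    return sum(counts)

if __name__ == "__main__":
    # In a streaming job each mapper reads its split from stdin;
    # here one process plays both roles for demonstration
    print(reduce_counts(map_line(l) for l in sys.stdin))
```

The per-line `split`/`parse` work here is exactly the CPU cost that made the single-node MR runs CPU bound.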
Outline Motivation SW & HW architecture Test setup Results Summary 22
Results 23
Step I Optimize read workload using one machine –2x4 cores –16GB RAM –1x Storage, controller speed 4 Gbps 24
Map Reduce results (I)
Running processing on a single node – CPU bound 25
(chart: throughput vs number of CPU cores and number of disks)
Map Reduce results (II)
Running read-only processing on a single node – IO bound 26
(chart: throughput vs number of CPU cores and number of disks)
Map Reduce results (III)
Running Python MR (Hadoop Streaming) processing on a single node – CPU bound 27
(chart: throughput vs number of CPU cores and number of disks)
Map Reduce results (IV)
Running optimized Java MR processing on a single node 28
(chart: throughput vs number of CPU cores and number of disks)
Map Reduce results
Running optimized Java MR on LZO-compressed data (568 GB) – CPU bound 29
(chart: throughput vs number of CPU cores and number of disks)
What we have learnt so far Optimized reading for single node –High CPU utilisation –100% of available memory utilised by Java –Managed to deliver performance close to maximum capability of the hardware 30
Step II
Scaling up
–5 nodes
–5 storages, FC connection 4 Gbps
–parallelism degree = 8 31
Map Reduce Results Scaling the cluster 32
Results 33
Step 0
Data design – there are several possibilities for storing a CSV file in a database
–Single column table: all attributes as one CSV string
–Multi column table: proper column data types (NUMBERS, STRINGS, etc.); columns can be compressed

Storage type / Size:
Single column [STRING]: 1204 GB
Multi column [NUMBERS]: 580 GB
Multi column [NUMBERS], compressed: 418 GB
Multi column [STRINGS]: 1203 GB
Multi column [STRINGS], compressed: 956 GB
34
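The size gap in the table can be illustrated with a small sketch: text encodes each digit as a byte, while a binary numeric encoding is fixed-width. The `struct` doubles below are only a stand-in for Oracle's native NUMBER format, not its actual encoding.

```python
import struct

row = [0.123456789012345, 123456.789012345,
       3.141592653589793, 2.718281828459045]

# Single column [STRING]: the whole record kept as one CSV text value
as_csv = ",".join(str(v) for v in row)

# Multi column [NUMBERS]: each attribute as a fixed-width binary float
as_binary = struct.pack("%dd" % len(row), *row)

print(len(as_csv.encode()), "bytes as CSV text")
print(len(as_binary), "bytes as binary doubles")
```

For high-precision values the text form needs roughly two bytes per digit plus separators, which is consistent with the NUMBERS variants being about half the size of the STRING variants above.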
Step I Optimize read workload using one machine 35
Oracle results
PQ on a single instance/machine attached to shared storage 36
(chart: throughput vs number of CPU cores)
What we have learnt so far
Data processing is IO bound
–CPU is not heavily utilized
–async and direct IO are used
–memory utilisation under control
The server's FC controller speed throttles the IO throughput
–max 750 MB/s
Adding more machines means more FC controllers, so better IO performance should be observed 37
Step II
Scaling up
–5 nodes
–parallelism degree = 8
–5 storages in shared configuration, FC connection 4 Gbps 38
Oracle results (II) PQ on cluster database (RAC) 39
Outline Motivation SW & HW architecture Test setup Results Summary 40
Comparison (I) – Throughput
Similar results for Oracle and Hadoop
–Less than the theoretical maximum throughput in both cases
–The difference is understood: throughput is dominated by disk read contention 41
Comparison (II) – Time
Oracle's implementation achieves lower data processing time, explained by its more optimal data representation. 42
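The representation argument in one sketch: with text storage every scan must split and parse the full 244-field record before evaluating the predicate, while typed column storage hands the query ready-made numbers. Illustrative Python, not either system's actual code path:

```python
def filter_csv(line):
    # Text storage: split the full 244-field record, then convert the
    # two needed fields from text to numbers on every single scan
    rec = line.split(",")
    return int(rec[19]) == 1 and float(rec[92]) > 100000

def filter_typed(row):
    # Typed storage: values are already numeric, so the predicate is
    # just two comparisons with no per-row parsing
    return row[19] == 1 and row[92] > 100000

rec = ["0"] * 244
rec[19], rec[92] = "1", "150000.0"
print(filter_csv(",".join(rec)))              # text path
print(filter_typed([float(v) for v in rec]))  # typed path
```

Both paths read the same amount of data from disk; the typed path simply skips the per-row text-to-number conversion, which is where the processing-time difference comes from.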
Comparison (III) – CPU utilisation
Hadoop consumes a lot of CPU and memory
–for more complex processing this might become an issue
Oracle is IO-bound 43
Conclusions
What is faster?
–Oracle performs well for small clusters
–Hadoop is slightly slower, but…
What scales better?
–Oracle's scaling ability is limited by the shared storage
–Hadoop scales very well; we can expect that on a bigger cluster it would outperform Oracle
Cost-performance
–Hadoop license vs Oracle license
–Shared nothing vs shared storage 44
Comments
Writing efficient Map Reduce code is not trivial
–an optimized implementation can significantly improve performance
Hadoop is still in very active development
–new releases can address the observed issues
A lot of data at CERN is already in Oracle
–the cost of exporting it to another system is not negligible
Valuable experience gained for helping others with Hadoop Map Reduce implementations 45
Luca Canali Andrei Dumitru Eric Grancher Maaike Limper Stefano Russo Giacomo Tenaglia Many Thanks! 46
Q&A 47
Backup slides 48
How fast could HDFS be?
The MR scheduler is not "hard drive aware"
–multiple processes are reading from the same devices 49
Improved scheduler
Custom MR implementation in Python
–"disk aware" scheduling
Better performance achieved:
–Custom Python MR: 1.7 GB/s
–Optimized Java MR: 1.52 GB/s
–Python MR: 1.3 GB/s 50
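A "disk aware" policy of the kind described here can be sketched as a simple wave scheduler that never lets two concurrent map tasks read from the same device. This is hypothetical code, not the actual custom implementation:

```python
def disk_aware_schedule(tasks, max_concurrent_per_disk=1):
    """Group (task_id, disk) pairs into waves so that within a wave no
    disk serves more than max_concurrent_per_disk readers, avoiding the
    seek contention a disk-oblivious scheduler causes."""
    waves = []
    for task_id, disk in tasks:
        # place the task in the first wave where its disk has a free slot
        for wave in waves:
            if sum(1 for _, d in wave if d == disk) < max_concurrent_per_disk:
                wave.append((task_id, disk))
                break
        else:
            waves.append([(task_id, disk)])
    return waves

tasks = [("t1", "disk0"), ("t2", "disk0"), ("t3", "disk1"),
         ("t4", "disk1"), ("t5", "disk2")]
for i, wave in enumerate(disk_aware_schedule(tasks)):
    print("wave", i, wave)
```

Each wave keeps every spindle streaming sequentially instead of seeking between competing readers, which is the effect behind the 1.3 GB/s to 1.7 GB/s improvement reported above.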