© Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 1 HiVertica Capstone Project.

Presentation transcript:

1 HiVertica Capstone Project. University of Pittsburgh, January 11, 2013. Stephen Walkauskas, Architect, Data Management, Vertica.

2 Contact info Stephen Walkauskas

3 Vertica culture

4 What Is Vertica: SQL database for real-time analytics; runs on x86 hardware; MPP columnar architecture (scales to PBs!); reduced footprint via advanced compression; extensible analytics capabilities; easy to set up and use; elastic (grow/shrink as needed); extensive ecosystem of analytic tools. Speed, Scale, Simplicity.

5 Map/Reduce

6 HQL
SELECT a.val1, a.val2, b.val, c.val
FROM a
JOIN b ON (a.key = b.key)
LEFT OUTER JOIN c ON (a.key = c.key)

7 HiVertica

8 HiVertica
a) Write code to read Hive / HCatalog metadata and generate DDL to create corresponding external tables (ETs) in a Vertica DB.
b) Configure the ETs with the files referenced by the corresponding Hive tables. Vertica ships a connector to source files from HDFS. Using this connector, the ETs can be used to query data in Hive (assuming the data is in a format Vertica can parse).
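Steps a) and b) could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the `HiveTable` record, the type mapping, the example table, and the WebHDFS URL are all hypothetical, and the `SOURCE Hdfs(url=..., username=...)` clause is an assumption about how the HDFS connector is invoked. Real code would read the metadata from Hive / HCatalog rather than hard-coding it.

```python
from dataclasses import dataclass

# Hypothetical, partial Hive-to-Vertica type mapping; extend as needed.
HIVE_TO_VERTICA = {
    "tinyint": "INTEGER",
    "smallint": "INTEGER",
    "int": "INTEGER",
    "bigint": "INTEGER",
    "float": "FLOAT",
    "double": "FLOAT",
    "boolean": "BOOLEAN",
    "string": "VARCHAR(65000)",
    "timestamp": "TIMESTAMP",
}


@dataclass
class HiveTable:
    """Stand-in for the metadata one would read from Hive / HCatalog."""
    name: str
    columns: list             # list of (column_name, hive_type) pairs
    location: str             # URL of the table's files (assumed WebHDFS form)
    field_delim: str = "\x01"  # Hive's default field delimiter (Ctrl-A)


def vertica_external_table_ddl(table, hdfs_user):
    """Build a CREATE EXTERNAL TABLE statement for one Hive table."""
    cols = []
    for col, hive_type in table.columns:
        vtype = HIVE_TO_VERTICA.get(hive_type.lower())
        if vtype is None:
            raise ValueError(f"no Vertica mapping for Hive type {hive_type!r}")
        cols.append(f"    {col} {vtype}")
    # Render the delimiter as an escaped octal literal, e.g. E'\001'.
    delim = f"E'\\{ord(table.field_delim):03o}'"
    return (
        f"CREATE EXTERNAL TABLE {table.name} (\n"
        + ",\n".join(cols)
        + f"\n) AS COPY SOURCE Hdfs(url='{table.location}', "
        + f"username='{hdfs_user}') DELIMITER {delim};"
    )


# Hypothetical example table and cluster address.
example = HiveTable(
    name="page_views",
    columns=[("user_id", "bigint"), ("url", "string")],
    location="http://namenode:50070/webhdfs/v1/user/hive/warehouse/page_views/*",
)
print(vertica_external_table_ddl(example, hdfs_user="hive"))
```

Raising on an unmapped type, rather than guessing, keeps unsupported Hive types (maps, arrays, structs) visible instead of silently producing a table that cannot be parsed.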

9 HiVertica
c) Vertica supports User Defined Parsers (you can write your own CSV parser if you're so inclined). RCFile is commonly used to store data in Hive, so it would be useful to be able to parse that format in a Vertica UDParser.
d) Find the place in Hive where it compiles HQL into M/R jobs and instead rewrite the HQL as SQL and, leveraging the above features, send the query to Vertica. The two systems are not 100% compatible; we can tweak them to shrink the feature gap.
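The routing decision in step d) might look something like the sketch below. It is only an illustration under stated assumptions: `route_query` and the `HIVE_ONLY` pattern list are hypothetical names, the list of Hive-specific constructs is a small assumed sample rather than a complete account of the feature gap, and a real integration would hook into Hive's query compiler and work on the parsed query rather than on regular expressions.

```python
import re

# Assumed sample of HiveQL constructs with no direct Vertica SQL equivalent.
HIVE_ONLY = [
    r"\bLATERAL\s+VIEW\b",
    r"\bDISTRIBUTE\s+BY\b",
    r"\bCLUSTER\s+BY\b",
    r"\bSORT\s+BY\b",
    r"\bTRANSFORM\s*\(",
]


def route_query(hql):
    """Return ("vertica", sql) if the query looks portable, else ("hive", hql)."""
    for pattern in HIVE_ONLY:
        if re.search(pattern, hql, flags=re.IGNORECASE):
            # Fall back to the normal Hive path (compile to M/R jobs).
            return ("hive", hql)
    # Portable as far as this check can tell: pass it through as SQL.
    return ("vertica", hql.strip().rstrip(";"))


query = (
    "SELECT a.val1, a.val2, b.val, c.val FROM a "
    "JOIN b ON (a.key = b.key) LEFT OUTER JOIN c ON (a.key = c.key)"
)
target, sql = route_query(query)
print(target)
```

The check errs toward Hive: anything that matches a Hive-only pattern, even inside a string literal, stays on the M/R path, so a false match costs speed but not correctness.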

10 Thanks!