Hive UDF content/uploads/downloads/2013/09/HWX.Qu bole.Hive_.UDF_.Guide_.1.0.pdf UT Dallas 1.

Presentation transcript:


UDF: User-Defined Functions
A UDF is a great tool for extending HiveQL. UDFs are written in Java and then integrated into Hive, where they behave like built-in functions and can be called from a Hive query.
Hive built-in functions:
    hive> SHOW FUNCTIONS;
    hive> DESCRIBE FUNCTION concat;
    hive> DESCRIBE FUNCTION EXTENDED concat;
    concat(str1, str2, ... strN) - returns the concatenation of str1, str2, ... strN
    Returns NULL if any argument is NULL.
    Example:
      > SELECT concat('abc', 'def') FROM src LIMIT 1;
      'abcdef'

UDF cont’d
    SELECT concat(column1, column2) AS x FROM table;
Standard functions: round(), floor(), abs(), ucase(), reverse()
Aggregate functions: sum(), avg(), count(), min(), and max()

UDF cont’d
UDTFs: User-Defined Table-Generating Functions
split() returns one array per input row:
    hive> select split(bday, '-') as bd_func from littlebigdata;
    ["2","12","1981"]
    ["10","10","2004"]
    ["4","5","1974"]
explode() is a UDTF: it emits one output row per array element:
    hive> select explode(split(bday, '-')) as bd_func from littlebigdata;
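To make the difference between the split() and explode() queries above concrete, here is a minimal plain-Java sketch. It is not Hive code: the class and method names are hypothetical, and the bday strings are assumed from the array output shown on the slide. Hive's split() treats its second argument as a regular expression; the sketch quotes the delimiter so it is matched literally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Plain-Java illustration only; ExplodeSketch is a hypothetical name, not a Hive class.
class ExplodeSketch {
    // Mimics split(bday, '-'): one input value yields one array of parts.
    static List<String> split(String value, String delimiter) {
        return Arrays.asList(value.split(Pattern.quote(delimiter)));
    }

    // Mimics explode(array): every array element becomes its own output row.
    static List<String> explode(List<List<String>> arrays) {
        List<String> rows = new ArrayList<>();
        for (List<String> array : arrays) {
            rows.addAll(array);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Assumed bday values, reconstructed from the split() output on the slide.
        List<String> bdays = Arrays.asList("2-12-1981", "10-10-2004", "4-5-1974");
        List<List<String>> splitRows = new ArrayList<>();
        for (String bday : bdays) {
            splitRows.add(split(bday, "-"));
        }
        System.out.println(splitRows);          // three rows, one array each
        System.out.println(explode(splitRows)); // nine rows, one element each
    }
}
```

The point of the sketch is the row count: split() keeps one row per input record, while explode() produces as many rows as there are array elements.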

Custom UDF Example
my_to_upper function
We will use the following:
File name: littlebigdata.txt with the following content: edward sara
    hive> CREATE TABLE IF NOT EXISTS littlebigdata(
            name STRING, STRING, bday STRING, ip STRING, gender STRING, anum INT)
          ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
    hive> LOAD DATA LOCAL INPATH 'unix/path/to/littlebigdata.txt'
          INTO TABLE littlebigdata;

Java code

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.hive.ql.exec.Description;
    import org.apache.hadoop.io.Text;

    @Description(
        name = "my_to_upper",
        value = "_FUNC_(str) - Converts a string to uppercase",
        extended = "Example:\n" + "  > SELECT my_to_upper(author_name) FROM authors a;")
    public class ToUpper extends UDF {
        // Hive calls evaluate() once per row; a NULL input yields an empty string.
        public Text evaluate(Text s) {
            Text to_value = new Text("");
            if (s != null) {
                try {
                    to_value.set(s.toString().toUpperCase());
                } catch (Exception e) { // Should never happen
                    to_value = new Text(s);
                }
            }
            return to_value;
        }
    }

Java code
Extend the UDF class and write the evaluate() method.
evaluate() methods can be overloaded.
@Description is an optional Java annotation used by the DESCRIBE FUNCTION ... command.
_FUNC_ strings will be replaced with the function name.
Argument and return types must be types that Hive can serialize (e.g., for numbers, use int, the Integer wrapper object, or IntWritable, which is Hadoop's wrapper for integers).
In the previous example we used Text.
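The note above on evaluate() and its argument types can be sketched in plain Java. This is an illustration under stated assumptions, not a deployable UDF: the class name is hypothetical, it does not extend org.apache.hadoop.hive.ql.exec.UDF, and it uses String and Integer where a real UDF would more likely use Hadoop's Text and IntWritable. It shows two evaluate() methods that differ only in parameter type, and why wrapper types are convenient: a SQL NULL can arrive as a Java null.

```java
// Plain-Java stand-in; a real Hive UDF would extend
// org.apache.hadoop.hive.ql.exec.UDF (assumption: not done here so the
// sketch compiles without Hive on the classpath).
class EvaluateOverloadSketch {
    // String variant: upper-cases the input and passes null through.
    public String evaluate(String s) {
        return s == null ? null : s.toUpperCase();
    }

    // Integer variant: the wrapper type lets a NULL arrive as null,
    // which a primitive int parameter could not represent.
    public Integer evaluate(Integer n) {
        return n == null ? null : Math.abs(n);
    }

    public static void main(String[] args) {
        EvaluateOverloadSketch f = new EvaluateOverloadSketch();
        System.out.println(f.evaluate("edward"));      // String overload
        System.out.println(f.evaluate(-7));            // Integer overload (boxed)
        System.out.println(f.evaluate((String) null)); // null is passed through
    }
}
```

Unlike the ToUpper example, which returns an empty string for a NULL input, this sketch returns null; which behaviour is preferable depends on how the query should treat missing values.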

Compile, JAR and Create func.
In the Unix shell:
    $ mkdir udf_classes_toUpper
    $ javac -classpath /usr/local/hive-0.9.0/lib/hive-exec jar:/usr/local/hadoop-1.2.1/hadoop-core jar -d udf_classes_toUpper ToUpper.java
    $ jar -cvf toupper.jar -C udf_classes_toUpper/ .
In the Hive shell:
    hive> add jar /people/cs/l/lkhan/toupper.jar;
    hive> CREATE TEMPORARY FUNCTION my_to_upper as 'ToUpper'; -- ToUpper is the Java class name

Function use
    hive> desc function extended my_to_upper;
    my_to_upper(str) - Converts a string to uppercase
    Example:
      > SELECT my_to_upper(author_name) FROM authors a;
    hive> select name, my_to_upper(name) from littlebigdata;
    edward capriolo    EDWARD CAPRIOLO
    bob                BOB
    sara connor        SARA CONNOR

Dropping a temp UDF
    hive> DROP TEMPORARY FUNCTION IF EXISTS my_to_upper;
To make a function permanent:
The code should be added to the Hive source code (FunctionRegistry class).
Rebuild Hive and redeploy.

Thank you
Programming Hive book
and-serdes/
and-serdes/