Pig from Alan Gates’ book (In preparation for exam2)

1 Pig from Alan Gates’ book (In preparation for exam2)

2 Introduction Download Pig from pig.apache.org (onto timberlake or your local computer/laptop). Unzip and untar it, and you are set to go. You can run Pig in local mode for learning purposes; later on you can test it on your Hadoop installation. Navigate to the directory where Pig is installed and run ./bin/pig -x local. This starts Pig in local mode and puts you at the grunt prompt.

3 Data and Pig script Create a directory called data in the directory where bin is located. Download all the data files for the Pig book from GitHub and store them in the data directory: NYSE_dividends, NYSE_daily, etc. Now go through the examples in chapters 1-4, either by typing them in line by line or by creating script files. A script such as Mystockanalysis.pig can be executed with ./bin/pig -x local Mystockanalysis.pig, or entered line by line at the grunt prompt.

4 Chapter 1 Hello world of Pig: the "Mary had a little lamb" example.
Go through the example on p. 3. Create the "mary" file in your data directory. Type in the commands line by line as on p. 3. Now create a ch1.pig file out of the commands. Run the script file using the pig command. Try some other commands not listed there. Understand the examples discussed on pp. 5-6.
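
For orientation, here is a minimal sketch of a word-count script in the spirit of the "mary" example. It is not copied from the book. The relation names (lines, words, grpd, cntd) and the path data/mary are assumptions that match the directory layout described on the earlier slide; adjust them to your setup.

```pig
-- Assumed path: the "mary" file sits in the data directory next to bin.
-- Each line of the file is loaded as a single chararray field named line.
lines = load 'data/mary' as (line:chararray);

-- TOKENIZE splits a line into a bag of words;
-- flatten turns that bag into one record per word.
words = foreach lines generate flatten(TOKENIZE(line)) as word;

-- Collect identical words into one group per distinct word.
grpd = group words by word;

-- For each group, emit the word and the number of records in its bag.
cntd = foreach grpd generate group, COUNT(words);

-- Print the result to the console.
dump cntd;
```

Saved as ch1.pig, a script like this could be run with ./bin/pig -x local ch1.pig, or typed one statement at a time at the grunt prompt.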

5 Chapter 2 Discusses installing and running Pig.
Go through the example on p. 14. That's all.

6 Chapter 3 Discusses the Grunt shell, Pig's interactive prompt. Running pig -x local starts it in local mode and shows the grunt> prompt. See the example on p. 20.

7 Chapter 4 Pig data model Scalars: int, long, float, double, etc.
Complex types: Map: a chararray-to-element mapping, sort of like key-value pairs. Tuple: an ordered collection of Pig elements, e.g. ('bob', 55). Bag: an unordered collection of tuples. Nulls. Schemas: Pig has a lax attitude towards schemas. Explicit: dividends = load 'NYSE_dividends' as (exchange:chararray, symbol:chararray, date:chararray, dividend:float); Or you could omit the types: divs = load 'NYSE_dividends' as (exchange, symbol, date, dividend); See the table on p. 28 and the examples on pp. 28-30.
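
One way to see tuples, bags, and schemas at work is to ask Pig to describe a relation before and after grouping. The sketch below assumes NYSE_dividends is stored under data/ as set up on the earlier slide; the relation name grpd is chosen here for illustration.

```pig
-- Explicit schema: every field has a name and a declared type.
dividends = load 'data/NYSE_dividends' as
    (exchange:chararray, symbol:chararray, date:chararray, dividend:float);

-- Prints the schema of the relation: a bag of four-field tuples.
describe dividends;

-- Grouping produces one tuple per symbol. Each tuple holds the group key
-- and a bag containing all the original tuples for that symbol.
grpd = group dividends by symbol;
describe grpd;

-- Names only, no types: fields without a declared type default to bytearray.
divs = load 'data/NYSE_dividends' as (exchange, symbol, date, dividend);
describe divs;
```

Comparing the three describe outputs shows how the declared schema, the nested bag produced by group, and the untyped load differ.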

8 Chapter 5 Pig Latin Look at the examples on pp. 33-50.
Commands discussed are: load, store, dump. Relational operations: foreach, filter, group, order by, distinct, join. Data operations: limit, sample, parallel.
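
As a rough illustration of how several of these operators chain together, the sketch below computes an average dividend per symbol. It is an illustrative pipeline, not one of the book's examples: the relation names, the 0.10 threshold, and the output directory average_dividends are all assumptions, and the schema is the one given on the data model slide.

```pig
-- load: read the dividends file with an explicit schema.
divs = load 'data/NYSE_dividends' as
    (exchange:chararray, symbol:chararray, date:chararray, dividend:float);

-- filter: keep only records above an arbitrary threshold.
big = filter divs by dividend > 0.10;

-- group: one record per symbol; parallel sets the number of reducers
-- on a cluster and has no effect in local mode.
grpd = group big by symbol parallel 2;

-- foreach: compute the average dividend inside each group.
avgd = foreach grpd generate group as symbol, AVG(big.dividend) as avg_div;

-- order by: sort from highest to lowest average.
srtd = order avgd by avg_div desc;

-- limit: keep only the first five records.
top5 = limit srtd 5;

-- dump prints to the console; store writes to an output directory,
-- which must not already exist.
dump top5;
store srtd into 'average_dividends';
```

Running each statement at the grunt prompt and calling describe on the intermediate relations is a quick way to check what every operator produces.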

