Pig from Alan Gates’ book (In preparation for exam2)

gavivi


  1. Pig from Alan Gates’ book (In preparation for exam2)

  2. Introduction
     • Download Pig from pig.apache.org (onto timberlake or your local computer/laptop).
     • Unzip and untar it. You are set to go.
     • You can run Pig in local mode for learning purposes. Later on you can test it on your Hadoop installation.
     • Navigate to the directory where Pig is installed and run: ./bin/pig -x local
     • This puts you in the Grunt shell, running in local mode.

  3. Data and Pig script
     • Create a data directory (called data) in the directory where bin is located.
     • Download from GitHub all the data files related to the Pig book and store them in the data directory:
       • NYSE_dividends
       • NYSE_daily
       • etc.
     • Now go through the examples in chapters 1-4, either by typing them in line by line or by creating script files.
     • A script such as Mystockanalysis.pig can be executed with ./bin/pig -x local Mystockanalysis.pig, or entered line by line at the grunt prompt.
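The setup above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the script body (an average-dividend query), the relation names, and the `data/NYSE_dividends` path are hypothetical choices, not taken from the slides; only the file name Mystockanalysis.pig and the `./bin/pig -x local` command come from the text.

```python
from pathlib import Path

# Hypothetical script contents: average dividend per stock symbol.
# Relation names and the data/ path are assumptions for illustration.
PIG_SCRIPT = """\
dividends = load 'data/NYSE_dividends' as (exchange:chararray, symbol:chararray, date:chararray, dividend:float);
grouped = group dividends by symbol;
avgdiv = foreach grouped generate group, AVG(dividends.dividend);
dump avgdiv;
"""


def write_script(pig_home, name="Mystockanalysis.pig"):
    """Create data/ next to bin/ and write the script into the Pig directory."""
    home = Path(pig_home)
    (home / "data").mkdir(parents=True, exist_ok=True)
    script = home / name
    script.write_text(PIG_SCRIPT)
    return script


def local_command(script):
    """Argument list equivalent to: ./bin/pig -x local <script>"""
    return ["./bin/pig", "-x", "local", Path(script).name]
```

Once Pig is unpacked, something like `subprocess.run(local_command(script), cwd=pig_home)` should launch the script in local mode, provided the data files have been copied into data/.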

  4. Chapter 1
     • The "hello world" of Pig.
     • The "Mary had a little lamb" example.
     • Go through the example on page 3.
     • Create a "mary" file in your data directory.
     • Type in the commands line by line as on p. 3.
     • Now create a ch1.pig file out of the commands.
     • Run the script file using the pig command.
     • Try some other commands not listed there.
     • Understand the examples discussed on pp. 5, 6.
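To check what the Pig script should print, it can help to compute the same answer another way. The sketch below is a plain-Python analogue, assuming the page-3 example is the usual word count (load lines, tokenize, group by word, count). The function name and sample text are illustrative, and `str.split()` only approximates Pig's TOKENIZE, which also splits on a few punctuation characters.

```python
from collections import Counter


def word_count(lines):
    """Plain-Python analogue of a tokenize / group / count pipeline."""
    # Like: foreach ... generate flatten(TOKENIZE(line)) -> one record per word
    words = [word for line in lines for word in line.split()]
    # Like: group words by word, then COUNT each group
    return Counter(words)


# Illustrative contents for the "mary" file.
mary = [
    "Mary had a little lamb",
    "its fleece was white as snow",
    "and everywhere that Mary went",
    "the lamb was sure to go",
]

counts = word_count(mary)
for word, n in sorted(counts.items()):
    print(word, n)
```

Comparing this output with the result of `dump` on the same file is a quick sanity check that the script was typed in correctly.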

  5. Chapter 2
     • Discusses installing and running Pig.
     • Go through the example on p. 14.
     • That’s all.

  6. Chapter 3
     • Discusses the Grunt shell, Pig's interactive prompt (used here in local mode).
     • pig -x local
     • Results in the grunt prompt: grunt>
     • See the example on page 20.

  7. Chapter 4
     • Pig data model
     • Scalars: int, long, float, double, etc.
     • Complex types:
       • Map: a chararray-to-element mapping, somewhat like key-value pairs
       • Tuple: an ordered collection of Pig elements, e.g. ('bob', 55)
       • Bag: an unordered collection of tuples
     • Nulls
     • Schemas: Pig has a lax attitude towards schemas.
     • Explicit, with types:
       dividends = load 'NYSE_dividends' as (exchange:chararray, symbol:chararray, date:chararray, dividend:float);
     • Or you could give field names only (the fields then default to bytearray):
       divs = load 'NYSE_dividends' as (exchange, symbol, date, dividend);
     • See the table on page 28.
     • See the examples on pp. 28, 29, 30.
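As a rough mental model, the data types above map onto familiar Python structures. The sketch below is an analogy, not Pig itself: the helper names, the `SCHEMA` list, and the sample rows are made up for illustration. It mimics two loading behaviours: a value that cannot be cast to the declared type becomes null, and a record with too few fields is padded with nulls.

```python
# Rough Python analogues of Pig's complex types (illustrative values).
a_tuple = ("bob", 55)                  # tuple: ordered collection of fields
a_bag = [("bob", 55), ("sally", 52)]   # bag: collection of tuples; order carries no meaning
a_map = {"name": "bob", "age": 55}     # map: chararray keys to arbitrary values


def to_float(raw):
    """Cast like a declared float field: a failed cast yields None (null)."""
    try:
        return float(raw)
    except (TypeError, ValueError):
        return None


# Analogue of: as (exchange:chararray, symbol:chararray, date:chararray, dividend:float)
SCHEMA = [("exchange", str), ("symbol", str), ("date", str), ("dividend", to_float)]


def load_line(line, schema=SCHEMA):
    """Split a tab-delimited line and cast each field according to the schema."""
    raw = line.rstrip("\n").split("\t")
    raw += [None] * (len(schema) - len(raw))   # pad missing fields with null
    return tuple(None if v is None else cast(v) for (_, cast), v in zip(schema, raw))


print(load_line("NYSE\tXYZ\t2009-12-30\t0.14"))
print(load_line("NYSE\tXYZ\t2009-12-30\tn/a"))
print(load_line("NYSE\tXYZ"))
```

The tab delimiter reflects the default used by Pig's standard loader; everything else here is a simplification.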

  8. Chapter 5
     • Pig Latin
     • Look at the examples on pp. 33-50.
     • Commands discussed are:
       • load, store, dump
       • Relational operations: foreach, filter, group, order by, distinct, join
       • Data operations: limit, sample, parallel
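To get a feel for what the relational operators compute, here is a plain-Python analogue of a small pipeline. It is a sketch on made-up rows in the (exchange, symbol, date, dividend) layout. The Pig Latin statements in the comments show roughly what each step corresponds to, and the relation names are hypothetical.

```python
from collections import defaultdict

# Made-up rows: (exchange, symbol, date, dividend).
dividends = [
    ("NYSE", "AAA", "2009-01-15", 0.10),
    ("NYSE", "BBB", "2009-02-15", 0.50),
    ("NYSE", "AAA", "2009-04-15", 0.30),
    ("NYSE", "CCC", "2009-05-15", 0.05),
]

# kept = filter dividends by dividend >= 0.10;
kept = [row for row in dividends if row[3] >= 0.10]

# grouped = group kept by symbol;
grouped = defaultdict(list)
for row in kept:
    grouped[row[1]].append(row)

# avgs = foreach grouped generate group, AVG(kept.dividend);
avgs = [(symbol, sum(r[3] for r in rows) / len(rows)) for symbol, rows in grouped.items()]

# ordered = order avgs by the average, descending; top = limit ordered 1;
top = sorted(avgs, key=lambda pair: pair[1], reverse=True)[:1]

# Distinct symbols (analogue of projecting the symbol field, then distinct).
symbols = set(row[1] for row in dividends)

print(top)
print(sorted(symbols))
```

In Pig each statement defines a new relation from an earlier one, which is why the Python version is written as a chain of intermediate variables rather than one nested expression.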
