Actors and Actresses


Danger

  • Please be careful!


IMDb (I assume you all know?)


IMDb Dump

Not open/free!


The Question You are Going to Answer …

Which pair of actors/actresses have acted together the most times?


An Example

In how many movies have Al Pacino and Robert De Niro starred together, according to IMDb?

?


IMDb: Typical File

  • Log in to the machine cluster.dcc.uchile.cl

  • Username: uhadoop

  • zcat /data/hadoop/hadoop/data/imdb/actors.list.gz | more


IMDb: Already Parsed

zcat /data/hadoop/hadoop/data/imdb/tsv/actpersons-to-movies.tsv.gz | more

How many theatrical movies was Uma Thurman in?

zcat /data/hadoop/hadoop/data/imdb/tsv/actresses-to-movies.tsv.gz | grep -e "^Thurman, Uma" | grep -e "THEATRICAL_MOVIE" | wc -l
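
The same filter-and-count logic can be sketched in Java, which may help when moving from the shell to a Hadoop job. This is a minimal sketch, not part of the lab code: the class name LineFilter and the method countMatching are hypothetical, and the lines are assumed to have been read from the decompressed file already. It makes no assumption about the column layout; like the two grep calls, it only checks how a line starts and whether it contains the marker.

```java
import java.util.List;

// Hypothetical helper, not part of the lab project.
class LineFilter {
    // Counts lines that start with `prefix` and contain `marker`,
    // mirroring: grep -e "^prefix" | grep -e "marker" | wc -l
    static long countMatching(List<String> lines, String prefix, String marker) {
        return lines.stream()
                .filter(line -> line.startsWith(prefix)) // grep -e "^prefix"
                .filter(line -> line.contains(marker))   // grep -e "marker"
                .count();                                // wc -l
    }
}
```

For the question above, the call would be countMatching(lines, "Thurman, Uma", "THEATRICAL_MOVIE").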



1. Download the project

http://aidanhogan.com/teaching/cc5212-1/mdp-lab5.zip


2. Implement the Hadoop job(s)!

  • Adapt WordCount example

    • Refer to lab slides from last week

  • You can use a separate class file for each part of the task

  • Test on small file

    • /uhadoop/imdb/actpersons-to-movies.100k.tsv

  • Run on big file

    • /uhadoop/imdb/full/actpersons-to-movies.tsv

  • Write to your directory!!!

    • /uhadoop/[username]
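
One possible first job groups the cast by movie and emits every co-star pair. The sketch below simulates that map/reduce logic in plain Java, without the Hadoop API, so the idea can be tried locally before adapting WordCount. Everything here is an assumption for illustration: the class and method names are hypothetical, the "##" pair separator is arbitrary, and the TSV is assumed to hold the actor name in the first column and the movie in the second (check the real file with zcat before relying on this).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical local simulation; not the lab's EmitPairs class.
class EmitPairsSketch {
    // "Map" step: key each actor by movie.
    // ASSUMPTION: column 0 = actor name, column 1 = movie.
    static Map<String, TreeSet<String>> groupByMovie(List<String> tsvLines) {
        Map<String, TreeSet<String>> cast = new TreeMap<>();
        for (String line : tsvLines) {
            String[] cols = line.split("\t");
            if (cols.length < 2) continue; // skip malformed lines
            cast.computeIfAbsent(cols[1], k -> new TreeSet<>()).add(cols[0]);
        }
        return cast;
    }

    // "Reduce" step: for each movie, emit every unordered pair once,
    // with the lexicographically smaller name first so that
    // (A, B) and (B, A) become the same key.
    static List<String> emitPairs(Map<String, TreeSet<String>> cast) {
        List<String> pairs = new ArrayList<>();
        for (TreeSet<String> actors : cast.values()) {
            List<String> sorted = new ArrayList<>(actors);
            for (int i = 0; i < sorted.size(); i++) {
                for (int j = i + 1; j < sorted.size(); j++) {
                    pairs.add(sorted.get(i) + "##" + sorted.get(j));
                }
            }
        }
        return pairs;
    }
}
```

Ordering the two names inside each pair matters: without it, the same two people could be counted under two different keys. A movie with n cast members yields n(n-1)/2 pairs, so large casts dominate the output size.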


3. Continuation

  • Count the pairs

    • CountPairs.java

  • Sort the pairs

    • SortPairs.java

  • Figure out the input

  • Figure out the map/reduce phase

  • Adapt a previous example

    • WordCount or EmitPairs

    • Change generics

    • Implement new Map/Reduce

  • Run it!
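
The count and sort steps listed above can be sketched in plain Java as follows. This is a local simulation under stated assumptions, not the contents of CountPairs.java or SortPairs.java: the class name CountAndSortSketch and both method names are hypothetical, and the input is assumed to be one string per emitted pair (for example "A##B", where the separator is an arbitrary choice).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical local simulation of the count and sort steps.
class CountAndSortSketch {
    // Count step: same shape as WordCount, with a pair as the "word".
    static Map<String, Integer> countPairs(List<String> pairs) {
        Map<String, Integer> counts = new HashMap<>();
        for (String pair : pairs) {
            counts.merge(pair, 1, Integer::sum);
        }
        return counts;
    }

    // Sort step: highest count first, ties broken by pair name.
    static List<Map.Entry<String, Integer>> sortByCountDesc(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries;
    }
}
```

The first entry of the sorted list answers the question of which pair acted together most often. In an actual Hadoop job the sorting would more likely be left to the shuffle phase, for instance by making the count the output key of the sort job's mapper; that design is a suggestion here, not something the slides prescribe.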

