
An introduction to Apache Sqoop

An introduction to Apache Sqoop: what is it, and how does it assist in large-volume data transfer between Hadoop and external sources?

semtechs

Presentation Transcript


  1. Apache Sqoop • What is it? • How does it work? • Interfaces • Example • Architecture www.semtech-solutions.co.nz info@semtech-solutions.co.nz

  2. Sqoop – What is it? • A command line interface (plus a web interface in Sqoop2) • For data import / export to and from Hadoop • Uses Map jobs from MapReduce • Supports incremental loads • Written in Java • Open source under the Apache License • Uses plugins for new types of data source
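As a rough illustration of the "incremental loads" bullet, the sketch below assembles a Sqoop 1 incremental-append import and prints it for review instead of running it. The `--incremental`, `--check-column`, `--last-value` and `-P` options are standard Sqoop 1 import options; the function name `build_incremental_import`, the host `dbhost`, the user `etl_user`, the table `orders` and the column `id` are assumptions made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: assemble (but do not run) an incremental-append import.
# All connection details below are placeholders, not values from the slides.
build_incremental_import() {
  local last_value=$1          # highest value of the check column already imported
  local cmd=(
    sqoop import
    --connect "jdbc:mysql://dbhost:3306/db3"
    --username "etl_user"
    -P                         # prompt for the password instead of passing it inline
    --table "orders"
    --incremental append       # only fetch rows added since the last run
    --check-column "id"        # column Sqoop compares against --last-value
    --last-value "$last_value"
  )
  # Print the command on one line so it can be inspected or logged.
  printf '%s ' "${cmd[@]}"
  echo
}

build_incremental_import 1000
```

With these options, only rows whose `id` is greater than the supplied last value should be imported on the next run; the printed line can be pasted into a shell once the placeholders are replaced.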

  3. Sqoop – How does it work? • Data sliced into partitions • Mappers transfer data • Data types determined via metadata • Many data transfer formats supported, e.g. CSV, Avro • Can import into • Hive (use the --hive-import flag) • HBase (use the --hbase-* flags)
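To make "data sliced into partitions" and "mappers transfer data" more concrete, here is a simplified sketch of how an integer key range could be divided evenly between mappers. Sqoop's documented behaviour is to query the minimum and maximum of the split column and give each map task an evenly sized part of that range; the function `split_ranges` and the column name `id` are hypothetical, and this is an approximation of the idea, not Sqoop's actual splitter code.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the inclusive key range each mapper would read
# when the range [min, max] is divided as evenly as possible.
split_ranges() {
  local min=$1 max=$2 mappers=$3
  local total=$(( max - min + 1 ))      # number of key values to cover
  local base=$(( total / mappers ))     # minimum slice size per mapper
  local extra=$(( total % mappers ))    # first 'extra' mappers take one more key
  local lo=$min i size
  for (( i = 0; i < mappers; i++ )); do
    size=$(( base + (i < extra ? 1 : 0) ))
    (( size == 0 )) && continue         # more mappers than keys: nothing to read
    echo "mapper $i: id >= $lo AND id <= $(( lo + size - 1 ))"
    lo=$(( lo + size ))
  done
}

split_ranges 1 100 4
```

Each printed line corresponds to the WHERE clause one mapper would use, so the four mappers read disjoint slices in parallel. In Sqoop itself the split column and mapper count are chosen with `--split-by` and `-m` / `--num-mappers`.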

  4. Sqoop – Interfaces • Get data from • Relational databases • Data warehouses • NoSQL databases • Load to Hive and HBase • Integrates with Oozie for scheduling

  5. Sqoop – Example • An example Sqoop command to load data from MySQL into Hive:
     bin/sqoop-import --connect jdbc:mysql://<mysql host>:<mysql port>/db3 \
       --username <username> \
       --password <password> \
       --table <tableName> \
       --hive-table <Hive tableName> \
       --create-hive-table \
       --hive-import \
       --hive-home <hive path>
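The Hive import on slide 5 has an HBase counterpart that uses the --hbase flags mentioned on slide 3. The sketch below assembles such a command and prints it without running it. `--hbase-table`, `--column-family`, `--hbase-row-key` and `--hbase-create-table` are standard Sqoop 1 import options; the function name `build_hbase_import` and every value passed to it (host, user, table, column family, row key) are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: assemble (but do not run) an import into HBase.
# Arguments: source table, target HBase table, column family, row-key column.
build_hbase_import() {
  local table=$1 hbase_table=$2 family=$3 row_key=$4
  local cmd=(
    sqoop import
    --connect "jdbc:mysql://dbhost:3306/db3"   # placeholder connection string
    --username "etl_user"                      # placeholder user
    -P                                         # prompt for the password
    --table "$table"
    --hbase-table "$hbase_table"               # target HBase table
    --column-family "$family"                  # family that receives the columns
    --hbase-row-key "$row_key"                 # source column used as the row key
    --hbase-create-table                       # create the table if it is missing
  )
  printf '%s ' "${cmd[@]}"
  echo
}

build_hbase_import customers customers_hb cf id
```

Printing the command first makes it easy to check the placeholders before anything touches the cluster.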

  6. Sqoop – Architecture • Sqoop has moved from Sqoop1 to Sqoop2 • Changed from a client-side install to a server install • Now has web and command line access • Server now accesses Hive and HBase • Oozie uses the REST API

  7. Sqoop – Architecture – Sqoop1

  8. Sqoop – Architecture – Sqoop2

  9. Contact Us • Feel free to contact us at • www.semtech-solutions.co.nz • info@semtech-solutions.co.nz • We offer IT project consultancy • We are happy to hear about your problems • You pay only for the hours you need to solve your problems
