Introduction to Sqoop

makya

Presentation Transcript


  1. Introduction to Sqoop

  2. Table of Contents
     • Sqoop: introduction
     • Integration of RDBMS and Sqoop
     • Sqoop use case
     • Sample Sqoop commands
     • Key features of Sqoop

  3. What is Sqoop?
     Sqoop is a suite of tools that connects Hadoop and database systems.
     Major functions of Sqoop:
     • Import tables from databases into HDFS for deep analysis
     • Replicate database schemas in Hive’s metastore
     • Export MapReduce results back to a database for presentation to end users

  4. RDBMS: important but vulnerable?
     Importance of RDBMS:
     • Holds a lot of valuable data in the form of structured tables of several hundred GB
     • Provides fast access for OLTP applications: updating and deleting records, adding individual records, complex transactions
     Vulnerability:
     • Can’t store very large datasets (1 TB+)
     • Poor support for complex datatypes and large objects
     • Schema evolution is hard
     • Analytic queries are better suited to a batch-oriented system

  5. RDBMS and Hadoop
     [Diagram] RDBMS to HDFS: historical data (before processing). HDFS to RDBMS: results of data analysis (after processing).

  6. Sqoop use case: demographics-aware site analytics

  7. Sample Sqoop commands
     Import using Sqoop (input: MySQL table, reached through the JDBC MySQL driver):
       sqoop import --connect jdbc:mysql://db.foo.com/corp --table user-profiles
     Export using Sqoop (input: HDFS location with analysis results; output: MySQL table):
       sqoop export --connect jdbc:mysql://db.foo.com/corp --table ads_results --export-dir results
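The import and export commands on this slide can be wrapped in a small script so they can be reviewed before they touch a live database. The sketch below is only an illustration: the account name `analyst`, the HDFS path `/data/user-profiles`, and the `DRY_RUN` switch are assumptions that do not appear in the presentation. `--username`, `-P` (prompt for the password), and `--target-dir` are standard Sqoop options that the slide leaves out.

```shell
#!/bin/sh
# Sketch: dry-run wrapper around the slide's import/export pair.
set -eu

CONNECT="jdbc:mysql://db.foo.com/corp"   # JDBC URL taken from the slide
DB_USER="analyst"                        # hypothetical account name
DRY_RUN="${DRY_RUN:-1}"                  # 1 = print the command, 0 = execute it

# Print the command in dry-run mode, otherwise run it as given.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Import: MySQL table ($1) -> HDFS directory ($2).
sqoop_import() {
  run sqoop import --connect "$CONNECT" --username "$DB_USER" -P \
    --table "$1" --target-dir "$2"
}

# Export: HDFS directory ($2) -> existing MySQL table ($1).
sqoop_export() {
  run sqoop export --connect "$CONNECT" --username "$DB_USER" -P \
    --table "$1" --export-dir "$2"
}

sqoop_import user-profiles /data/user-profiles   # hypothetical HDFS path
sqoop_export ads_results results
```

With the default `DRY_RUN=1` the script only prints the two command lines; setting `DRY_RUN=0` would run them against a real Sqoop installation.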

  8. Key features of Sqoop
     • JDBC-based implementation: works with many popular database vendors
     • Auto-generation of tedious user-side code: makes it faster to write MapReduce applications that work with the data
     • Integration with Hive: lets users stay in a SQL-based environment
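Two of these features map onto Sqoop command-line options: `--hive-import` on `sqoop import` loads the table into Hive, and the `sqoop codegen` tool generates the Java record class for a table without moving any data. The sketch below only prints the command lines. The table name is reused from the sample commands, and the output directory `./generated-src` is an assumption chosen for illustration.

```shell
#!/bin/sh
# Sketch: print Sqoop commands for the Hive and code-generation features.
set -eu

CONNECT="jdbc:mysql://db.foo.com/corp"   # JDBC URL reused from the sample commands

# Import a table ($1) and load it into Hive in one step.
hive_import_cmd() {
  echo "sqoop import --connect $CONNECT --table $1 --hive-import"
}

# Generate the Java class for a table ($1) into a directory ($2).
codegen_cmd() {
  echo "sqoop codegen --connect $CONNECT --table $1 --outdir $2"
}

hive_import_cmd ads_results
codegen_cmd ads_results ./generated-src   # hypothetical output directory
```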

  9. THANK YOU
