An introduction to Apache Accumulo

A short introduction to Apache Accumulo. What is it, and how does it relate to Google's Bigtable? How does it use Hadoop, ZooKeeper and Thrift in its implementation?

semtechs

Presentation Transcript


  1. Apache Accumulo
     • What is it?
     • Design
     • Integrity
     • Administration
     • Squirrel
     www.semtech-solutions.co.nz info@semtech-solutions.co.nz

  2. Accumulo – What is it?
     • A key/value store
     • A column-oriented database
     • Based on Google's Bigtable
     • Built on
       • Apache Hadoop
       • Apache ZooKeeper
       • Apache Thrift
     • Written in Java
     • Licensed under the Apache License
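The key/value model on this slide can be sketched in a few lines. This is a toy in-memory model, not the Accumulo client API: the names `Key`, `ToyTable`, `put` and `scan` are hypothetical, and the only thing it tries to show is that each value is addressed by a multi-part key (row, column family, column qualifier, visibility, timestamp) and that entries are kept in sorted order, with newer timestamps first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Key:
    """Toy stand-in for a Bigtable-style key (hypothetical, not the real API)."""
    row: str
    family: str
    qualifier: str
    visibility: str
    timestamp: int

    def sort_key(self):
        # Text fields sort ascending; the timestamp is negated so that
        # the newest version of a cell comes first.
        return (self.row, self.family, self.qualifier,
                self.visibility, -self.timestamp)


class ToyTable:
    """Holds key/value pairs and returns them in sorted key order."""

    def __init__(self):
        self._cells = {}

    def put(self, key, value):
        self._cells[key] = value

    def scan(self, start_row=None, end_row=None):
        """Yield (key, value) pairs for rows in [start_row, end_row]."""
        for key in sorted(self._cells, key=Key.sort_key):
            if start_row is not None and key.row < start_row:
                continue
            if end_row is not None and key.row > end_row:
                continue
            yield key, self._cells[key]


table = ToyTable()
table.put(Key("user2", "info", "name", "", 1), "Bob")
table.put(Key("user1", "info", "name", "", 1), "Alice")
table.put(Key("user1", "info", "name", "", 2), "Alicia")

# Collect (row, timestamp, value) triples in scan order.
rows = [(k.row, k.timestamp, v) for k, v in table.scan()]
```

Sorting on every scan is only acceptable in a toy; a real store keeps its data sorted on disk so that range scans are cheap.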

  3. Accumulo – Design
     • Cell-level security via column visibility
     • Server-side programming via iterators
     • Table-based constraints written in Java
     • Sharding can be used for parallel document storage
     • Large rows can exceed memory size
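To make the cell-level security bullet concrete, here is a minimal sketch of how a visibility expression might be checked against a reader's authorizations. It is a simplified illustration, not Accumulo's own parser: the function names `tokenize` and `is_visible` are hypothetical, and the grammar assumed here is labels combined with `&` (and), `|` (or) and parentheses, where mixing `&` and `|` at the same level without parentheses is rejected.

```python
import re

# A label (letters, digits and a few punctuation marks) or a single operator.
TOKEN = re.compile(r"[A-Za-z0-9_.:/-]+|[&|()]")


def tokenize(expr):
    """Split a visibility expression into labels and operators."""
    tokens, pos = [], 0
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(0))
        pos = match.end()
    return tokens


def is_visible(expr, auths):
    """Return True if a reader holding `auths` may see a cell labelled `expr`."""
    if expr == "":
        return True  # an empty expression means the cell is visible to everyone
    tokens = tokenize(expr)
    pos = 0

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("expression ended unexpectedly")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            result = parse_expr()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if tok in ("&", "|", ")"):
            raise ValueError(f"unexpected operator {tok!r}")
        return tok in auths

    def parse_expr():
        nonlocal pos
        result = parse_term()
        op = None
        while pos < len(tokens) and tokens[pos] in ("&", "|"):
            if op is None:
                op = tokens[pos]
            elif tokens[pos] != op:
                raise ValueError("mixing & and | requires parentheses")
            pos += 1
            rhs = parse_term()  # always parse, so the position keeps advancing
            result = (result and rhs) if op == "&" else (result or rhs)
        return result

    result = parse_expr()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result
```

For example, a cell labelled `admin&(audit|ops)` would be returned to a reader holding `admin` and `ops`, but filtered out for a reader holding only `admin`.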

  4. Accumulo – Integrity
     • ZooKeeper used to manage master failover
     • Write-ahead logs written to each server
     • Logical time managed for
       • Consistent transactions
       • Bulk data import
     • FATE (Fault-Tolerant Executor) transactions
       • Transactions complete even after master failure
     • Isolation
       • Transactions see a consistent view of data at row level

  5. Accumulo – Administration
     • System monitoring and statistics via a web page
     • System and table configuration stored in ZooKeeper
     • Table names stored in ZooKeeper via IDs
     • Follow threads of execution using tracing
       • Record when actions take place
     • Accumulo can be used with the Squirrel server
       • As the next slide shows
       • A future presentation will cover Squirrel

  6. Accumulo – with Squirrel

  7. Accumulo – Data Management: Internal Data Management
     • Locality groups
       • Group columns within a single file
     • Smart compaction
       • Smaller files are merged with larger ones using a definable ratio until all files are merged
     • Minor compaction
       • To avoid the maximum file count being reached, in-memory files are merged with larger files
     • Loading user-created JARs
       • Load JARs from HDFS using VFS
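The ratio-based compaction bullet can be illustrated with a small selection routine. This is a sketch of the rule as described in the Accumulo user manual, not the server's actual code: a set of files is compacted when the largest file's size times the ratio is less than or equal to the combined size of the set, and otherwise the largest file is dropped from consideration and the check repeats. The function name `select_files_to_compact` and the default ratio of 3.0 are assumptions made for this example.

```python
def select_files_to_compact(file_sizes, ratio=3.0):
    """Return the sizes of the files chosen for compaction, or [] if none qualify."""
    # Consider files from largest to smallest.
    candidates = sorted(file_sizes, reverse=True)
    while len(candidates) > 1:
        largest = candidates[0]
        if largest * ratio <= sum(candidates):
            return candidates
        # The largest file dominates the set: set it aside and retry
        # with the remaining, smaller files.
        candidates = candidates[1:]
    return []


# One big file and three small ones: only the small files are merged.
chosen = select_files_to_compact([100, 10, 10, 10])
```

With sizes `[100, 10, 10, 10]` the big file is skipped because 100 × 3 exceeds the total of 130, while the three small files qualify because 10 × 3 equals their total of 30. A lower ratio makes compaction more eager, which trades extra write work for fewer files to read at scan time.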

  8. Accumulo – Data Management: On-Demand Data Management
     • Compactions
       • Force tablets (table partitions) to compact to a single file
     • Tablet merging
       • Request tablet merging via the shell
     • Table cloning
       • Clone a table from an existing one, referencing its data and configuration
     • Table import/export
       • Copy table data and metadata to another cluster
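The table cloning bullet describes a metadata-only operation: the clone references the source table's existing files and configuration rather than copying the data. The sketch below models that idea with a hypothetical `CloneableTable` class (not part of any Accumulo API), assuming that data files are immutable, so the two tables can share them safely and then diverge.

```python
class CloneableTable:
    """Toy table: a name, a list of immutable file references, and a config."""

    def __init__(self, name, files=(), config=None):
        self.name = name
        self.files = list(files)          # references to files, not their data
        self.config = dict(config or {})  # per-table settings

    def flush(self, file_name):
        """Record a newly written file for this table only."""
        self.files.append(file_name)

    def clone(self, new_name):
        """Create a table that points at the same files; no data is copied."""
        return CloneableTable(new_name, self.files, self.config)


source = CloneableTable("events", ["f1.rf", "f2.rf"], {"ratio": "3"})
test_copy = source.clone("events_test")

# After cloning, the two tables change independently.
test_copy.flush("f3.rf")
source.config["ratio"] = "2"
```

Because only the file list and settings are duplicated, cloning is quick regardless of table size, which makes it convenient for testing against production data without touching the original table.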

  9. Accumulo – Screen Shot

  10. Contact Us
      • Feel free to contact us at
        • www.semtech-solutions.co.nz
        • info@semtech-solutions.co.nz
      • We offer IT project consultancy
      • We are happy to hear about your problems
      • You pay only for the hours you need to solve your problems
