
Apache Trafodion

This presentation gives an overview of the Apache Trafodion project. It explains the Trafodion architecture in relation to Hadoop/HBase, and its process structure.

Links for further information and connecting:

http://www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
https://nz.linkedin.com/pub/mike-frampton/20/630/385
https://open-source-systems.blogspot.com/

semtechs

Presentation Transcript


  1. What Is Apache Trafodion?
     ● A relational database management system (RDBMS)
     ● Open source, under the Apache 2.0 license
     ● Runs on Apache Hadoop
     ● Written in Java and C++
     ● Originally developed by HP Labs
     ● Uses Apache HBase

  2. Hadoop Based Eco-System

  3. Trafodion Overview
     ● Uses Hadoop HBase / HDFS for storage
     ● ANSI SQL support
     ● ACID support (Atomicity, Consistency, Isolation, Durability)
     ● Supports big data sets
     ● Parallel processing in terms of
       – Query optimization
       – Query execution

  4. Trafodion Process Architecture

  5. Trafodion Process Architecture
     ● CMP ~ Second instance of the compiler code
     ● DCS = Database connectivity services
     ● DTM = Database transaction manager
     ● ISV = Independent software vendor
     ● ESP = Executor server process
     ● JDBC = Java database connectivity
     ● ODBC = Open database connectivity

  6. Trafodion Process Architecture
     ● Clients connect via JDBC or ODBC
     ● DCS Master / Server manage connections
     ● Master Executor processes SQL
     ● DTM manages transactions
     ● CMP manages DDL and utilities that require compiler code
     ● ESPs manage execution-time parallelism
     ● Storage engine uses HBase and Hadoop

  7. Trafodion Logs
     ● Uses log4j and log4cpp
     ● Log level set to ERROR by default
     ● Master executor logs stored on the local node
     ● All logs can be searched via a SQL UDF
       – select * from udf(event_log_reader( [options] ));
     ● Searches the log files on all nodes
     ● Returns time-stamped log data

  8. Trafodion Logs
     ● Returned columns:
       – log_ts timestamp(6),
       – severity char(10 bytes) character set utf8,
       – component char(24 bytes) character set utf8,
       – node_number integer,
       – cpu integer,
       – pin integer,
       – process_name char(12 bytes) character set utf8,
       – sql_code integer,
       – query_id varchar(200 bytes) character set utf8,
       – message varchar(4000 bytes) character set utf8
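As a rough illustration of how the returned columns might be used, the Python sketch below filters log rows by severity and sorts them newest first. It does not connect to Trafodion: an in-memory SQLite table named event_log (a hypothetical stand-in) takes the place of udf(event_log_reader( [options] )), and the sample rows, component names, and the recent_errors helper are all made up for the example. Only the column names come from the slide.

```python
import sqlite3

# Stand-in for the rows returned by udf(event_log_reader(...)).
# SQLite is used only so the sketch runs without a Trafodion instance;
# the column names follow the slide, the types are simplified.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE event_log (
        log_ts TEXT, severity TEXT, component TEXT,
        node_number INTEGER, cpu INTEGER, pin INTEGER,
        process_name TEXT, sql_code INTEGER, query_id TEXT, message TEXT
    )
    """
)

# Made-up sample rows, for illustration only.
sample_rows = [
    ("2015-06-01 10:00:00.000000", "INFO", "COMP_A", 0, 0, 101, "PROC1", 0, "Q1", "started"),
    ("2015-06-01 10:00:05.000000", "ERROR", "COMP_B", 1, 0, 202, "PROC2", 0, "Q2", "first failure"),
    ("2015-06-01 10:00:09.000000", "ERROR", "COMP_C", 1, 0, 203, "PROC3", 0, "Q3", "second failure"),
]
conn.executemany(
    "INSERT INTO event_log VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", sample_rows
)


def recent_errors(connection, limit=10):
    """Return (log_ts, node_number, component, message) for ERROR rows, newest first."""
    query = """
        SELECT log_ts, node_number, component, message
        FROM event_log
        WHERE severity = 'ERROR'
        ORDER BY log_ts DESC
        LIMIT ?
    """
    return connection.execute(query, (limit,)).fetchall()


errors = recent_errors(conn)
for row in errors:
    print(row)
```

Against a real instance, the same SELECT list and WHERE clause would presumably be applied to udf(event_log_reader( [options] )) in place of the stand-in table; the exact severity strings and row-limiting syntax there are not confirmed by the slides.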

  9. Trafodion Repository
     ● Repository contained in REPOS schema tables
     ● METRIC_QUERY_AGGR_TABLE
       – Statistics for short-running queries (aggregated)
     ● METRIC_QUERY_TABLE
       – Query statistics information
     ● METRIC_SESSION_TABLE
       – ODBC and JDBC session statistics
     ● METRIC_TEXT_TABLE
       – Reserved for future use

  10. Trafodion Configuration
     ● Files stored in the install conf directory
       – dcs-site.xml: site-specific information
       – dcs-default.xml: default configuration
       – dcs-env.sh: environment-specific settings
       – log4j.properties: log control
       – master: identifies the master host
       – backup-masters: identifies the backup master hosts
       – servers: identifies the server hosts
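The split between dcs-default.xml and dcs-site.xml suggests a layered configuration in which site values take precedence over defaults. The Python sketch below shows that idea under two assumptions that the slides do not confirm: that both files use the Hadoop-style `<configuration>/<property>/<name>/<value>` layout, and that a site entry overrides a default entry with the same name. The property names (example.master.port, example.log.dir) and the helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Parse Hadoop-style <property><name/><value/></property> entries into a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def effective_config(default_xml, site_xml):
    """Start from the defaults, then let site-specific entries override them."""
    merged = read_properties(default_xml)
    merged.update(read_properties(site_xml))
    return merged


# Hypothetical file contents; real property names will differ.
DEFAULT_XML = """
<configuration>
  <property><name>example.master.port</name><value>1000</value></property>
  <property><name>example.log.dir</name><value>/tmp/logs</value></property>
</configuration>
"""

SITE_XML = """
<configuration>
  <property><name>example.master.port</name><value>2000</value></property>
</configuration>
"""

config = effective_config(DEFAULT_XML, SITE_XML)
print(config)
```

Keeping local changes in the site file and leaving the defaults file untouched makes it easier to see what a given installation has customized.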

  11. Available Books
     ● See “Big Data Made Easy”
       – Apress, Jan 2015
     ● See “Mastering Apache Spark”
       – Packt, Oct 2015
     ● See “Complete Guide to Open Source Big Data Stack”
       – Apress, Jan 2018
     ● Find the author on Amazon
       – www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
     ● Connect on LinkedIn
       – www.linkedin.com/in/mike-frampton-38563020

  12. Connect
     ● Feel free to connect on LinkedIn
       – www.linkedin.com/in/mike-frampton-38563020
     ● See my open source blog at
       – open-source-systems.blogspot.com/
     ● I am always interested in
       – New technology
       – Opportunities
       – Technology based issues
       – Big data integration
