This presentation gives an overview of the Apache Phoenix project. It explains Phoenix in terms of its architecture, environment, ETL, SQL, UDFs and transactions.

Links for further information and connecting:
http://www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
https://nz.linkedin.com/pub/mike-frampton/20/630/385
https://open-source-systems.blogspot.com/
What Is Apache Phoenix? ● Massively parallel, relational database engine ● Supports OLTP for Hadoop ● Uses Apache HBase as its backing store ● Open source / Apache 2.0 license ● Written in Java, SQL ● ACID (atomicity, consistency, isolation, durability) – Via Apache Tephra integration
Phoenix SQL Support ● Accepts SQL queries ● Compiles them to HBase scans ● Orchestrates running of scans ● Produces regular JDBC result sets ● Creates performance gains by using – HBase API/coprocessors/custom filters ● Results in query response times – Milliseconds for small queries – Seconds for tens of millions of rows
Phoenix SQL Support ● See phoenix.apache.org for full syntax support
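The query path described above (SQL in, HBase scans underneath, a regular JDBC result set out) can be sketched with plain java.sql calls. This is a minimal sketch, not the presenter's code: it assumes the Phoenix client jar is on the classpath, and the EXAMPLE columns (id, name) and the class name PhoenixQuerySketch are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Sketch: query Phoenix through plain JDBC. Column names are assumed. */
final class PhoenixQuerySketch {

    // Hypothetical query against the EXAMPLE table; id and name are assumed columns.
    static final String QUERY =
        "SELECT id, name FROM EXAMPLE WHERE id > ? ORDER BY id LIMIT 10";

    /** Builds a Phoenix JDBC URL from a ZooKeeper quorum, e.g. "localhost". */
    static String jdbcUrl(String zkQuorum) {
        return "jdbc:phoenix:" + zkQuorum;
    }

    /** Runs the query, prints each row, and returns the number of rows read. */
    static int printRows(String url, long minId) throws SQLException {
        int count = 0;
        try (Connection conn = DriverManager.getConnection(url);
             PreparedStatement stmt = conn.prepareStatement(QUERY)) {
            stmt.setLong(1, minId);
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    // Columns are read by position to avoid identifier-case issues.
                    System.out.println(rs.getLong(1) + "\t" + rs.getString(2));
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length == 0) {
            // Without a quorum argument, only show the URL that would be used.
            System.out.println("Usage: PhoenixQuerySketch <zk-quorum>, e.g. "
                + jdbcUrl("localhost"));
            return;
        }
        int rows = printRows(jdbcUrl(args[0]), 0L);
        System.out.println(rows + " row(s)");
    }
}
```

Run it with a ZooKeeper quorum as the only argument to query a live cluster; with no argument it only prints the URL format.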
Phoenix Bulk Loading ● Bulk load data via psql or MapReduce ● Single-threaded for CSV via psql, e.g. – bin/psql.py -t EXAMPLE localhost data.csv – Loads data.csv into the EXAMPLE table – For HBase on the local machine ● MapReduce-based for CSV and JSON – See next slide
Phoenix Bulk Loading ● Bulk load example for MapReduce – For CSV and JSON loads – Using Phoenix MapReduce library – Against the EXAMPLE table
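The slide's own command did not survive in this transcript, so the following is only a sketch of what a MapReduce bulk load invocation typically looks like, assembled as an argument list in Java. The tool class names come from the Phoenix MapReduce library; the jar name phoenix-client.jar, the input path /data/example.csv, and the class name BulkLoadCommandSketch are placeholders, not values from the presentation.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch: assemble the "hadoop jar" command for a Phoenix MapReduce bulk load. */
final class BulkLoadCommandSketch {

    // Bulk load tools shipped in the Phoenix MapReduce library.
    static final String CSV_TOOL = "org.apache.phoenix.mapreduce.CsvBulkLoadTool";
    static final String JSON_TOOL = "org.apache.phoenix.mapreduce.JsonBulkLoadTool";

    /** Returns the command as an argument list suitable for ProcessBuilder. */
    static List<String> command(String clientJar, String toolClass,
                                String table, String input) {
        return Arrays.asList("hadoop", "jar", clientJar, toolClass,
                             "--table", table, "--input", input);
    }

    public static void main(String[] args) {
        // Placeholder jar name and HDFS input path; adjust for the actual install.
        List<String> cmd =
            command("phoenix-client.jar", CSV_TOOL, "EXAMPLE", "/data/example.csv");
        System.out.println(String.join(" ", cmd));
        // To launch it for real:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Swapping CSV_TOOL for JSON_TOOL and pointing --input at a JSON file gives the JSON variant of the same load.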
Phoenix User-Defined Functions (UDFs) ● Create temporary/permanent UDFs – Temporary for session only ● Use UDFs in SQL and indexes ● Permanent UDFs stored in SYSTEM.FUNCTION ● Tenant-specific UDF usage supported ● UDF jar files must be placed on HDFS ● UDF jar updates not currently possible – (without cluster bounce)
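The difference between temporary and permanent UDFs shows up in the DDL used to register them. The sketch below only builds the CREATE FUNCTION statement text; the function name my_reverse, the class com.example.MyReverse, the jar path, and the helper UdfDdlSketch are all hypothetical, and the resulting string would still have to be executed over a Phoenix JDBC connection.

```java
/** Sketch: build Phoenix CREATE FUNCTION statements for temporary or permanent UDFs. */
final class UdfDdlSketch {

    /**
     * Builds the DDL text. A null jarPath omits the USING JAR clause.
     * All names passed in are illustrative, not taken from the presentation.
     */
    static String createFunction(boolean temporary, String name, String argType,
                                 String returnType, String className, String jarPath) {
        StringBuilder sql = new StringBuilder("CREATE ");
        if (temporary) {
            sql.append("TEMPORARY ");   // visible to the current session only
        }
        sql.append("FUNCTION ").append(name)
           .append('(').append(argType).append(") RETURNS ").append(returnType)
           .append(" AS '").append(className).append('\'');
        if (jarPath != null) {
            sql.append(" USING JAR '").append(jarPath).append('\'');  // jar on HDFS
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        // Session-scoped function.
        System.out.println(createFunction(true, "my_reverse", "varchar", "varchar",
            "com.example.MyReverse", null));
        // Permanent function, loaded from a placeholder HDFS jar path.
        System.out.println(createFunction(false, "my_reverse", "varchar", "varchar",
            "com.example.MyReverse", "hdfs:/localhost:8080/hbase/lib/myjar.jar"));
    }
}
```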
Phoenix Transactions ● Cross-row, cross-table ACID support using Apache Tephra ● Transactional functionality currently beta ● Enable transactions and set the snapshot dir in hbase-site.xml ● Also set a transactional timeout value ● Start Tephra ● Create tables with flag TRANSACTIONAL=true ● Then transactions act as follows – Start with statement against table – End with commit or rollback
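The transaction life cycle above (implicit start on the first statement, explicit commit or rollback) might look like this over JDBC. It is a sketch under assumptions: Tephra is running, transactions are enabled in hbase-site.xml, the Phoenix client jar is on the classpath, and the table TX_EXAMPLE with columns k and v is hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

/** Sketch: two upserts that commit or roll back together on a transactional table. */
final class PhoenixTxSketch {

    // Hypothetical table; the TRANSACTIONAL=true flag is what the slide calls for.
    static final String CREATE_TABLE =
        "CREATE TABLE IF NOT EXISTS TX_EXAMPLE (k BIGINT PRIMARY KEY, v VARCHAR) "
        + "TRANSACTIONAL=true";
    static final String UPSERT = "UPSERT INTO TX_EXAMPLE (k, v) VALUES (?, ?)";

    static void writeAtomically(String url) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url)) {
            conn.setAutoCommit(false);          // commit and rollback are explicit
            try (Statement ddl = conn.createStatement()) {
                ddl.execute(CREATE_TABLE);
            }
            try (PreparedStatement ps = conn.prepareStatement(UPSERT)) {
                // The first statement against the table starts the transaction.
                ps.setLong(1, 1L);
                ps.setString(2, "first");
                ps.executeUpdate();
                ps.setLong(1, 2L);
                ps.setString(2, "second");
                ps.executeUpdate();
                conn.commit();                  // both rows become visible together
            } catch (SQLException e) {
                conn.rollback();                // neither row is kept
                throw e;
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length == 0) {
            // Without a quorum argument, only show the statements involved.
            System.out.println(CREATE_TABLE);
            System.out.println(UPSERT);
            return;
        }
        writeAtomically("jdbc:phoenix:" + args[0]);
    }
}
```

Passing a ZooKeeper quorum as the only argument runs the writes against a live cluster; with no argument the program just prints the two statements.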
Available Books ● See “Big Data Made Easy” – Apress Jan 2015 ● See “Mastering Apache Spark” – Packt Oct 2015 ● See “Complete Guide to Open Source Big Data Stack” – Apress Jan 2018 ● Find the author on Amazon – www.amazon.com/Michael-Frampton/e/B00NIQDOOM/ ● Connect on LinkedIn – www.linkedin.com/in/mike-frampton-38563020
Connect ● Feel free to connect on LinkedIn – www.linkedin.com/in/mike-frampton-38563020 ● See my open source blog at – open-source-systems.blogspot.com/ ● I am always interested in – New technology – Opportunities – Technology based issues – Big data integration