
Apache Samoa ML

This presentation gives an overview of the Apache Samoa ML project. It explains Apache Samoa ML in terms of its architecture, the way that it abstracts implementation via its API, and the stream processing systems that it supports.

Links for further information and connecting:

http://www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
https://nz.linkedin.com/pub/mike-frampton/20/630/385
https://open-source-systems.blogspot.com/

semtechs

Presentation Transcript


  1. What Is Apache Samoa? ● An Apache incubator project ● A machine learning framework ● A distributed, scalable system ● Deploys to existing Apache systems – Storm, S4, Samza, AVRO – Deploy a Samoa algorithm to these systems – Samoa abstracts implementation via its API ● Designed for stream processing ● Offers a range of ML algorithms

  2. Samoa Terms ● Samoa terms that might be of use – PE: Processing element – PI: Processing item – EPI: Entrance processing item – Spout: A Storm term for a data source – Bolt: A Storm term for a data processing element (for example filtering, aggregation, or joins) – ML: Machine learning

  3. Samoa Algorithms ● Samoa supported algorithms – Prequential Evaluation Task – Vertical Hoeffding Tree Classifier – Adaptive Model Rules Regressor – Bagging and Boosting – Distributed Stream Clustering – Distributed Stream Frequent Itemset Mining – SAMOA for MOA users

  4. Samoa Architecture

  5. Samoa Architecture

  6. Samoa Architecture ● The aim of Samoa is to provide implementation abstraction ● For stream processing algorithms ● Written using its API ● Against the stream processing systems that it supports ● So, for instance, write an algorithm once and ● Deploy it to both S4 and Storm ● The deployment process creates a platform jar ● That you can deploy to the specific platform
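
The "write once, deploy to several platforms" idea on this slide can be sketched in plain Java. This is only an illustration of the abstraction pattern, not Samoa code: `PlatformAdapter`, `OneByOneAdapter`, `BatchAdapter`, and `WriteOnce` are hypothetical names invented here, and the two adapters merely stand in for real engines such as S4 or Storm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical adapter interface: one implementation per target platform.
interface PlatformAdapter {
    List<String> run(UnaryOperator<String> logic, List<String> input);
}

// Stand-in for one platform: handles events one at a time.
class OneByOneAdapter implements PlatformAdapter {
    @Override
    public List<String> run(UnaryOperator<String> logic, List<String> input) {
        List<String> out = new ArrayList<>();
        for (String event : input) {
            out.add(logic.apply(event));
        }
        return out;
    }
}

// Stand-in for a second platform: handles events in fixed-size batches.
class BatchAdapter implements PlatformAdapter {
    private final int batchSize;

    BatchAdapter(int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
    }

    @Override
    public List<String> run(UnaryOperator<String> logic, List<String> input) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < input.size(); start += batchSize) {
            int end = Math.min(start + batchSize, input.size());
            for (String event : input.subList(start, end)) {
                out.add(logic.apply(event));
            }
        }
        return out;
    }
}

// The "algorithm" is written once, with no reference to either adapter.
class WriteOnce {
    static final UnaryOperator<String> TAG_EVENT = event -> "seen:" + event;
}
```

Passing `WriteOnce.TAG_EVENT` to either adapter yields the same results, which is the property the slide describes: the algorithm does not change when the execution platform does.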

  7. Samoa Topology

  8. Samoa Topology ● Samoa provides a simple topology for stream processing ● This includes the elements – Processor – Content Event – Stream – Task – Topology Builder – Learner – Processing Item

  9. Samoa Processor ● Processor is the basic logical processing unit ● All logic is written in the processor ● In Samoa, a Processor is an interface ● Users can implement this interface – To build their own custom class ● A processor in a Samoa topology can be – A regular processor within the topology – An entrance processor which sources the stream
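
A minimal sketch of implementing a processor interface, as this slide describes. The `Processor` and `ContentEvent` interfaces below are simplified stand-ins declared locally for illustration; they are not imported from Samoa and their method signatures are assumptions. `KeyLengthProcessor` is a hypothetical custom class.

```java
// Simplified stand-ins defined here for illustration; they are not imported
// from Samoa and do not claim to reproduce its real method signatures.
interface ContentEvent {
    String getKey();
}

interface Processor {
    // Handles one event; returns true if the event was processed.
    boolean process(ContentEvent event);
}

// A user-defined processor: all of its logic lives in process().
class KeyLengthProcessor implements Processor {
    private int totalKeyLength = 0;

    @Override
    public boolean process(ContentEvent event) {
        if (event == null || event.getKey() == null) {
            return false; // nothing to do for an empty event
        }
        totalKeyLength += event.getKey().length();
        return true;
    }

    int totalKeyLength() {
        return totalKeyLength;
    }
}
```

Keeping the logic inside one method of an interface is what lets the framework, rather than the user, decide where and how the processor is run.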

  10. Samoa Content Event ● A message or an event is called a Content Event in Samoa ● It is an event which contains content which ● Needs to be processed by the processors ● ContentEvent has been implemented as an interface in Samoa ● Users need to implement the ContentEvent interface ● To create their custom message classes
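
A custom message class along the lines described on this slide might look as follows. The `ContentEvent` interface here is a locally declared stand-in with assumed methods, not the Samoa interface itself, and `SensorReadingEvent` is a hypothetical example class.

```java
// Stand-in interface declared locally; the methods are assumptions
// made for this sketch, not a copy of Samoa's ContentEvent.
interface ContentEvent {
    String getKey();
    boolean isLastEvent();
}

// A hypothetical custom message: an immutable sensor reading.
final class SensorReadingEvent implements ContentEvent {
    private final String sensorId;
    private final double value;
    private final boolean last;

    SensorReadingEvent(String sensorId, double value, boolean last) {
        this.sensorId = sensorId;
        this.value = value;
        this.last = last;
    }

    @Override
    public String getKey() {
        return sensorId; // events from one sensor share a key
    }

    @Override
    public boolean isLastEvent() {
        return last;
    }

    double value() {
        return value;
    }
}
```

Making the event immutable is a design choice of this sketch: an event that may be delivered to several processors is safer to share if nothing can modify it.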

  11. Samoa Stream ● A stream is a physical unit of SAMOA topology ● Which connects different Processors with each other ● Stream is also created by a TopologyBuilder – Just like a Processor ● A stream can have a single source but many destinations ● A Processor which is the source of a stream owns the stream
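
The "single source, many destinations" property of a stream can be sketched like this. `EventStream` is a hypothetical class written for this example and is not Samoa's stream type; in particular, delivering every event to every destination is an assumption of the sketch, and a real system may route events differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stream: owned by exactly one named source,
// connected to any number of destinations.
class EventStream<E> {
    private final String sourceName;
    private final List<Consumer<E>> destinations = new ArrayList<>();

    EventStream(String sourceName) {
        this.sourceName = sourceName;
    }

    // The processor that owns (and writes to) this stream.
    String sourceName() {
        return sourceName;
    }

    void connect(Consumer<E> destination) {
        destinations.add(destination);
    }

    // Called by the owning source; this sketch delivers the event
    // to every connected destination.
    void put(E event) {
        for (Consumer<E> destination : destinations) {
            destination.accept(event);
        }
    }
}
```

The source name is fixed at construction time, which mirrors the slide's point that the source processor owns the stream, while destinations can be attached afterwards.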

  12. Samoa Task ● Task is similar to a job in Hadoop ● Task is an execution entity ● A topology must be defined inside a task ● Samoa can only execute classes ● That implement the Task interface
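
The rule that only classes implementing a task interface can be executed can be illustrated with a small sketch. `Task`, `HelloTask`, and `TaskRunner` are hypothetical names declared here; the method names are assumptions and do not reproduce Samoa's actual Task interface or launcher.

```java
// Stand-in interface declared locally for this sketch.
interface Task {
    void init();            // the topology is defined inside the task
    String topologyName();
}

// A hypothetical task that "defines" a topology by naming it.
class HelloTask implements Task {
    private String topologyName;

    @Override
    public void init() {
        topologyName = "hello-topology";
    }

    @Override
    public String topologyName() {
        return topologyName;
    }
}

// A hypothetical runner that refuses anything that is not a Task.
class TaskRunner {
    static String run(Object candidate) {
        if (!(candidate instanceof Task)) {
            throw new IllegalArgumentException("Not a Task: " + candidate);
        }
        Task task = (Task) candidate;
        task.init();
        return task.topologyName();
    }
}
```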

  13. Samoa Topology Builder ● TopologyBuilder is a builder class ● Which builds physical units of the topology ● And assembles them together ● Each topology has a name ● An example topology might have – An EntrancePI – Some PIs – Some streams
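
The builder idea on this slide (a named topology assembled from an entrance item, some processing items, and some streams) can be sketched as below. `SketchTopologyBuilder` and `Topology` are hypothetical classes written for this example; they use plain strings as identifiers and do not reproduce the methods of Samoa's TopologyBuilder.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical result object: just records what was assembled.
final class Topology {
    final String name;
    final String entrance;
    final List<String> processingItems;
    final List<String> streams;

    Topology(String name, String entrance,
             List<String> processingItems, List<String> streams) {
        this.name = name;
        this.entrance = entrance;
        this.processingItems = processingItems;
        this.streams = streams;
    }
}

// Hypothetical builder: collects the units, then assembles them.
class SketchTopologyBuilder {
    private final String name;
    private String entrance;
    private final List<String> processingItems = new ArrayList<>();
    private final List<String> streams = new ArrayList<>();

    SketchTopologyBuilder(String name) {
        this.name = name;
    }

    SketchTopologyBuilder entrance(String id) {
        this.entrance = id;
        return this;
    }

    SketchTopologyBuilder processingItem(String id) {
        processingItems.add(id);
        return this;
    }

    SketchTopologyBuilder stream(String from, String to) {
        streams.add(from + " -> " + to);
        return this;
    }

    Topology build() {
        if (entrance == null) {
            throw new IllegalStateException("a topology needs an entrance");
        }
        // Copy the lists so later builder calls cannot change the result.
        return new Topology(name, entrance,
                new ArrayList<>(processingItems), new ArrayList<>(streams));
    }
}
```

A topology with one entrance, two processing items, and two streams would then be built with a chain such as `new SketchTopologyBuilder("demo").entrance("source").processingItem("pi-1").processingItem("pi-2").stream("source", "pi-1").stream("pi-1", "pi-2").build()`.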

  14. Samoa Learner ● Learners are sub-topologies ● Use the init() function to – Add streams – Add processors – Specify connections to the topology ● Use the getInputProcessor() function to – Specify the processor that will manage the input stream ● Use the getResultStream() function to – Specify what is going to be the output stream
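
The three functions named on this slide can be sketched as a small interface. The method names `init()`, `getInputProcessor()`, and `getResultStream()` come from the slide; their parameters and return types below are assumptions, and `Learner`, `Wiring`, and `MajorityClassLearner` are local stand-ins rather than Samoa classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical record of what a learner added to the surrounding topology.
class Wiring {
    final List<String> processors = new ArrayList<>();
    final List<String> streams = new ArrayList<>();
}

// Stand-in interface; parameter and return types are assumptions.
interface Learner {
    void init(Wiring wiring);     // add processors and streams
    String getInputProcessor();   // which processor receives the input
    String getResultStream();     // which stream carries the output
}

// A hypothetical learner that contributes one processor and one stream.
class MajorityClassLearner implements Learner {
    private String inputProcessor;
    private String resultStream;

    @Override
    public void init(Wiring wiring) {
        inputProcessor = "model-processor";
        resultStream = "prediction-stream";
        wiring.processors.add(inputProcessor);
        wiring.streams.add(resultStream);
    }

    @Override
    public String getInputProcessor() {
        return inputProcessor;
    }

    @Override
    public String getResultStream() {
        return resultStream;
    }
}
```

Exposing only an input processor and a result stream lets the enclosing topology treat the learner as a single unit, whatever it contains internally.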

  15. Samoa Processing Item ● Processing Item is a hidden physical unit of the topology ● Is just a wrapper of Processor ● It is used internally ● Is not accessible from the API ● Connects the Processor to the other processors in the topology – Simple Processing Item (PI) – Entrance Processing Item (EntrancePI)

  16. Available Books ● See “Big Data Made Easy” – Apress Jan 2015 ● See “Mastering Apache Spark” – Packt Oct 2015 ● See “Complete Guide to Open Source Big Data Stack” – Apress Jan 2018 ● Find the author on Amazon – www.amazon.com/Michael-Frampton/e/B00NIQDOOM/ ● Connect on LinkedIn – www.linkedin.com/in/mike-frampton-38563020

  17. Connect ● Feel free to connect on LinkedIn – www.linkedin.com/in/mike-frampton-38563020 ● See my open source blog at – open-source-systems.blogspot.com/ ● I am always interested in – New technology – Opportunities – Technology based issues – Big data integration
