
Apache Kafka

This presentation gives an overview of the Apache Kafka project. It covers producers, consumers, topics, partitions, APIs, architecture and usage.

Links for further information and connecting:
http://www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
https://nz.linkedin.com/pub/mike-frampton/20/630/385
https://open-source-systems.blogspot.com/

Music: "Little Planet", composed and performed by Bensound, from http://www.bensound.com/

semtechs

Presentation Transcript


  1. What Is Apache Kafka?
     ● A stream processing platform
     ● Open source, Apache 2.0 license
     ● Written in Java and Scala
     ● A publish/subscribe system for record streams
     ● Scalable and fault tolerant
     ● Topic-based, partitioned FIFO queues

  2. How Does Kafka Work?
     ● Kafka runs as a cluster of servers
     ● Stores records in topics
     ● Topics are partitioned into queues
     ● Partitions are stored across the cluster
     ● Consumers are organised into groups
     ● Stream processors transform records
     ● Reusable connectors process queues
       – For instance database connectors
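
To make "consumers organised into groups" concrete, here is a minimal sketch of how the partitions of a topic might be shared out among the members of one group, so that each partition is read by exactly one member. It is a toy round-robin model, not Kafka's own code: Kafka ships several assignment strategies, and the function name, topic name and consumer names below are hypothetical.

```python
from collections import defaultdict


def assign_partitions(partitions, consumers):
    """Toy round-robin assignment of partitions to the members of one group."""
    if not consumers:
        raise ValueError("a consumer group needs at least one member")
    members = sorted(consumers)
    assignment = defaultdict(list)
    for i, partition in enumerate(sorted(partitions)):
        # Each partition goes to exactly one member of the group.
        assignment[members[i % len(members)]].append(partition)
    # Members with nothing assigned still appear, with an empty list.
    return {member: assignment[member] for member in members}


# Hypothetical topic "orders" with four partitions and a two-member group.
partitions = ["orders-0", "orders-1", "orders-2", "orders-3"]
group = assign_partitions(partitions, ["consumer-a", "consumer-b"])
print(group)
```

With more members than partitions, the extra members in this model receive an empty list, which mirrors the idea that a group cannot usefully have more active readers than the topic has partitions.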

  3. Kafka APIs
     ● Producer API
       – Allows applications to publish to topics
     ● Consumer API
       – Applications subscribe to topics / process data streams
     ● Streams API
       – Applications act as stream processors, transforming streams
     ● Connector API
       – Build reusable producers / consumers
       – e.g. RDBMS connectors/producers/consumers
     ● Admin API
       – For topic and broker management
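
The roles of the producer, consumer and streams APIs can be sketched with an in-memory stand-in. This is only an illustration of who does what: `ToyBroker`, `publish`, `fetch` and `run_stream_processor` are hypothetical names invented for this example and are not part of any Kafka client library.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a cluster: topic name -> list of records."""

    def __init__(self):
        self.topics = defaultdict(list)

    def publish(self, topic, record):
        # Producer role: append a record to a topic.
        self.topics[topic].append(record)

    def fetch(self, topic, offset=0):
        # Consumer role: read a copy of the records from a given position.
        return self.topics[topic][offset:]


def run_stream_processor(broker, source, sink, transform):
    """Streams role: read one topic, transform each record, write to another."""
    for record in broker.fetch(source):
        broker.publish(sink, transform(record))


broker = ToyBroker()
for line in ["kafka stores records", "in topics"]:
    broker.publish("lines", line)

run_stream_processor(broker, "lines", "lines-upper", str.upper)
print(broker.fetch("lines-upper"))
```

A real application would use a Kafka client against a running cluster; the sketch leaves out partitions, serialisation and delivery guarantees.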

  4. Kafka Logical Architecture

  5. Kafka Topic Queue Offsets

  6. Kafka Topic Queue Offsets
     ● Records are published to topics
     ● Topics are multi-subscriber
     ● Topics contain partition queues
     ● A partition queue contains a sequence of records
     ● Each record has a queue offset (position)
     ● Consumers use the offset to read records
     ● Queue record retention is configurable
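
The offset idea can be sketched as a small append-only log. This is a toy model under stated assumptions: the class `PartitionLog` and its `max_records` setting are hypothetical, and retention here is by record count for simplicity, whereas Kafka's retention is configured by time and size.

```python
class PartitionLog:
    """Toy partition queue: an append-only sequence of records with offsets."""

    def __init__(self, max_records=None):
        self.records = []
        self.base_offset = 0            # offset of self.records[0]
        self.max_records = max_records  # hypothetical count-based retention

    @property
    def next_offset(self):
        return self.base_offset + len(self.records)

    def append(self, value):
        """Store a record and return the offset it was given."""
        offset = self.next_offset
        self.records.append(value)
        if self.max_records is not None and len(self.records) > self.max_records:
            drop = len(self.records) - self.max_records
            del self.records[:drop]     # retention removes the oldest records
            self.base_offset += drop    # offsets of the rest never change
        return offset

    def read(self, offset, limit=10):
        """Return (offset, value) pairs starting at the requested offset."""
        start = max(offset, self.base_offset) - self.base_offset
        chunk = self.records[start:start + limit]
        first = self.base_offset + start
        return [(first + i, value) for i, value in enumerate(chunk)]


log = PartitionLog(max_records=3)
offsets = [log.append(v) for v in "abcde"]

# Two subscribers keep independent positions in the same queue.
late_reader = log.read(0)   # offsets 0 and 1 were already removed by retention
tail_reader = log.read(4)
print(offsets, late_reader, tail_reader)
```

Reading never removes a record, which is what lets several subscribers share one queue. A reader asking for an offset that retention has already removed is simply moved forward in this model; that is a simplification of how a real consumer is configured to react.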

  7. Kafka Producer Consumer

  8. Kafka Producer Consumer
     ● Producers write to partitions, e.g. Producer1 → P0
     ● Producers are responsible for record → partition mapping
     ● Kafka only guarantees order within a partition
     ● Kafka cluster contains <n> servers
     ● Partitions are mapped to servers
     ● Consumers are members of consumer groups
     ● Each consumer must maintain its partition read offset
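
The record → partition mapping and the per-partition ordering guarantee can be illustrated with a key hash. This is a sketch only: it uses CRC32 from the Python standard library as a stand-in hash, which is an assumption made for the example and not the hash Kafka's own partitioner uses. The function names and the event data are hypothetical.

```python
import zlib


def choose_partition(key, num_partitions):
    """Map a record key to a partition; the same key always maps the same way."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def produce(partitions, key, value):
    """Append a record to the partition chosen for its key."""
    p = choose_partition(key, len(partitions))
    partitions[p].append((key, value))
    return p


partitions = [[] for _ in range(3)]   # P0, P1, P2
events = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "view"),
    ("user-1", "logout"),
]
for key, value in events:
    produce(partitions, key, value)

# All records for one key sit in one partition, in the order they were sent.
p = choose_partition("user-1", len(partitions))
user1_events = [value for key, value in partitions[p] if key == "user-1"]
print(p, user1_events)
```

Records with different keys may land in different partitions, and nothing in this model orders them relative to each other, which is the sense in which order is guaranteed only within a partition.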

  9. Kafka's Stack Role
     ● A low latency messaging system
       – Records load balanced across partitions
     ● As a storage system
       – Using local file system storage
       – Scales horizontally in terms of performance
     ● As a stream processing system
       – Using stream API to transform data
     ● Data replication provides fault tolerance
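
Why replication gives fault tolerance can be shown with a small placement model. This is a deliberately simplified sketch: the staggered placement rule, the function names and the broker names are assumptions for illustration, and Kafka's real replica assignment takes more factors into account.

```python
def place_replicas(num_partitions, brokers, replication_factor):
    """Toy placement: put each partition's copies on consecutive brokers."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    return {
        p: [brokers[(p + i) % len(brokers)] for i in range(replication_factor)]
        for p in range(num_partitions)
    }


def surviving_replicas(placement, failed_broker):
    """Copies of each partition that remain after one broker is lost."""
    return {
        p: [b for b in replicas if b != failed_broker]
        for p, replicas in placement.items()
    }


brokers = ["broker-1", "broker-2", "broker-3"]
placement = place_replicas(4, brokers, replication_factor=2)
after_failure = surviving_replicas(placement, "broker-2")
print(placement)
print(after_failure)
```

With two copies of every partition on different servers, losing any single server still leaves at least one copy of each partition in this model.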

  10. Available Books
     ● See “Big Data Made Easy”
       – Apress Jan 2015
     ● See “Mastering Apache Spark”
       – Packt Oct 2015
     ● See “Complete Guide to Open Source Big Data Stack”
       – Apress Jan 2018
     ● Find the author on Amazon
       – www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
     ● Connect on LinkedIn
       – www.linkedin.com/in/mike-frampton-38563020

  11. Connect
     ● Feel free to connect on LinkedIn
       – www.linkedin.com/in/mike-frampton-38563020
     ● See my open source blog at
       – open-source-systems.blogspot.com/
     ● I am always interested in
       – New technology
       – Opportunities
       – Technology based issues
       – Big data integration
