tuplejump

tuplejump: The data engineering platform

A startup with a vision to simplify data engineering and empower the next generation of data-powered miracles!
Rohit, Founder and CEO
Satya, Founder and CTO

What do we do?
• The Tuplejump Platform provides ready-to-use, fully integrated, end-to-end data pipeline components to bring your idea to life fast!
• Most startups spend a lot of time studying and integrating various OSS projects. We have done this for you and assembled a system from best-of-breed components.
• Our service engineers can assist you, or develop anything from your PoCs to entire solutions in record time.

The Data Pipeline
COLLECT, TRANSFORM, STORE, EXPLORE, PREDICT, VISUALIZE (managed through OpsCenter)

The Tuplejump Platform | COLLECT
Hydra
The tentacled framework for gathering high-volume, high-velocity data from push sources (devices, page alerts, forms, etc.) and pull sources (web scraping, blogs, social networks, etc.). Powered by Akka, it reacts to events on demand and streams data to Spark for batch processing.

The Tuplejump Platform | TRANSFORM
Spark + Calliope
Use the friendly Spark API, with added features to easily consume data from and load data into Cassandra-powered storage. Transform structured and unstructured data, and join it with other data sets, in most cases simply by drag and drop. Join delta transformations on real-time feeds with existing data using Spark Streaming.

The Tuplejump Platform | STORE
DStore - Cassandra++
Cassandra, enriched with our custom components to provide a single storage mechanism for files, (un)structured data, generic data formats like XML and JSON, etc.
Stargate
A Lucene-powered indexing mechanism built right into C* to allow advanced indexing and searching of data.
SnackFS
An HDFS-compatible, fat-driver distributed file system over Cassandra.

The Tuplejump Platform | EXPLORE
Shark + Calliope
The Shark analytical engine shines at exploring large structured and unstructured data sets. With Calliope, you can get comprehensive reporting on data from Cassandra in seconds or minutes, not hours. Using Stargate indexes, you can filter a lot of data inside Cassandra, saving those agonizing hours of batch jobs.
UberCube
Our patent-pending UberCube (™) technology is a distributed OLAP cube engine designed from the ground up for interactive exploration of very large datasets.
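To make the OLAP cube idea concrete, here is a minimal, single-machine Python sketch of what a cube roll-up computes: a measure pre-aggregated for every combination of dimensions, so that later lookups are plain dictionary reads. This is a generic illustration only, not UberCube's implementation or API (the slides do not describe either); the function name `build_cube` and the sample event fields are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Sum `measure` for every subset of `dimensions` (a full roll-up).

    Hypothetical helper for illustration; keys are tuples of
    (dimension, value) pairs, with () holding the grand total.
    """
    cube = defaultdict(float)
    for row in rows:
        for k in range(len(dimensions) + 1):
            for dims in combinations(dimensions, k):
                key = tuple((d, row[d]) for d in dims)
                cube[key] += row[measure]
    return dict(cube)


# Made-up sample events, not data from the presentation.
events = [
    {"country": "IN", "device": "mobile", "clicks": 3},
    {"country": "IN", "device": "desktop", "clicks": 2},
    {"country": "US", "device": "mobile", "clicks": 5},
]
cube = build_cube(events, ["country", "device"], "clicks")

# Interactive-style lookups are now constant-time reads.
total_clicks = cube[()]
mobile_clicks = cube[(("device", "mobile"),)]
```

The number of pre-aggregated cells grows exponentially with the number of dimensions, which is one reason a cube over very large datasets would need to be distributed.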

The Tuplejump Platform | PREDICT
MinerBot
Builds on Spark's ML framework, adding EA and ANN/DL frameworks to take ML to the next level. Drag-and-drop machine learning coming soon!

The Tuplejump Platform | VISUALIZE
Pissaro
A modern, game-changing data frontend providing highly interactive and reactive visualization. Not just reports!

The Tuplejump Platform | OpsCenter
OpsCenter
A deployment, monitoring, and management framework built specifically for deploying, maintaining, and scaling our platform without touching your server.
Click to cluster
One-click deployment to take your application from development to cluster.
BigDataPaaS
Coming soon: a PaaS, so you can focus on your idea and let us worry about the rest.

Tuplejump Advantage
• All the advantages of Spark + all the advantages of Cassandra + much more!
• Over 500x faster than traditional Hadoop solutions (much more in the case of filtered data)
• Shark + C* provide superfast ad hoc querying
• UberCube enables sub-millisecond responses on very large cubes
• MinerBot provides ready-to-use ML algorithms, plus the possibility of much more complex algorithms and mechanisms than just MapReduce
• Ready to use, no integration required
• Easy to develop, deploy, monitor, and scale

Case Study I - IoT

• Hydra was designed for IoT in the first place. It supports MQTT for messaging to and from devices/sensors and for communication between devices.
• Use message processing to raise alerts
• Use batch processing for advanced data analytics
• DStore provides highly scalable, write-optimized distributed storage for events and messages
• MinerBot powers anomaly detection and automation based on event analysis and patterns
• Build multidimensional analytics cubes on the event features with UberCube
• Visualize and understand the events in charts with Pissaro
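The "message processing to raise alerts" step can be pictured with a small Python sketch. Everything in it is hypothetical: the `SensorReading` message shape, the `raise_alerts` function, and the threshold are illustrative assumptions. It does not use Hydra's API or an MQTT client, neither of which the slides describe.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class SensorReading:
    # Hypothetical message shape; the slides do not define one.
    device_id: str
    metric: str
    value: float


def raise_alerts(readings: Iterable[SensorReading], threshold: float) -> Iterator[str]:
    """Yield one alert string for each reading whose value exceeds the threshold."""
    for r in readings:
        if r.value > threshold:
            yield f"ALERT {r.device_id}: {r.metric}={r.value} exceeds {threshold}"


# Made-up sample messages standing in for a device feed.
sample = [
    SensorReading("sensor-1", "temperature", 21.5),
    SensorReading("sensor-2", "temperature", 88.0),
]
alerts = list(raise_alerts(sample, threshold=75.0))
```

Because `raise_alerts` is a generator, it processes readings one at a time as they arrive, which matches the per-message style of processing; the batch analytics bullet would instead operate on the stored events.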

Case Study II - Advertising

• Hydra enables high-volume, high-velocity data collection to gather page clicks, user events, user behavior, etc.
• Event processing to trigger and handle RTB
• MinerBot to optimize ad-user matching based on previous success/failure records
• Pissaro to power the advertiser dashboard and reports

Let's talk!
