
Hortonworks: We Do Hadoop.

Hortonworks: We Do Hadoop. Our mission is to enable your Modern Data Architecture by delivering Enterprise Apache Hadoop. January 2014.

necia

Presentation Transcript


  1. Hortonworks: We Do Hadoop. Our mission is to enable your Modern Data Architecture by Delivering Enterprise Apache Hadoop. January 2014

  2. Our Mission: Enable your Modern Data Architecture by Delivering Enterprise Apache Hadoop
     Our Commitment
     • Open Leadership: Drive innovation in the open exclusively via the Apache community-driven open source process
     • Enterprise Rigor: Engineer, test and certify Apache Hadoop with the enterprise in mind
     • Ecosystem Endorsement: Focus on deep integration with existing data center technologies and skills
     Headquarters: Palo Alto, CA
     Employees: 300+ and growing
     Reseller Partners

  3. A Traditional Approach Under Pressure
     • APPLICATIONS: Custom Applications, Packaged Applications, Business Analytics
     • DATA SYSTEM: Repositories (RDBMS, EDW, MPP)
     • SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs), Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
     2.8 ZB in 2012; 40 ZB by 2020; 85% from New Data Types; 15x Machine Data by 2020 (Source: IDC)

  4. Emerging Modern Data Architecture
     • APPLICATIONS: Custom Applications, Packaged Applications, Business Analytics
     • OPERATIONAL TOOLS: Manage & Monitor
     • DEV & DATA TOOLS: Build & Test
     • DATA SYSTEM: Repositories (RDBMS, EDW, MPP)
     • SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs), Emerging Sources (Sensor, Sentiment, Geo, Unstructured)

  5. Drivers of Hadoop Adoption
     • New Business Applications From NEW types of Data (or existing types for longer)

  6. Most Common NEW TYPES OF DATA
     • Sentiment: Understand how your customers feel about your brand and products – right now
     • Clickstream: Capture and analyze website visitors’ data trails and optimize your website
     • Sensor/Machine: Discover patterns in data streaming automatically from remote sensors and machines
     • Geographic: Analyze location-based data to manage operations where they occur
     • Server Logs: Research logs to diagnose process failures and prevent security breaches
     • Unstructured (txt, video, pictures, etc.): Understand patterns in files across millions of web pages, emails, and documents
     Value + Keep existing data longer!

  7. 20 Business Applications of Hadoop

  8. Drivers of Hadoop Adoption
     • New Business Applications
     • Architectural: A Modern Data Architecture. Complement your existing data systems: the right workload in the right place

  9. Requirements for Enterprise Hadoop in the Modern Data Architecture

  10. Requirements for Enterprise Hadoop
      1 Key Services: Platform, Operational and Data services essential for the enterprise
      2 Skills: Leverage your existing skills: development, analytics, operations
      3 Integration: Interoperable with existing data center investments
      • OPERATIONAL SERVICES (Cluster Mgmt, Dataset Mgmt, Schedule): AMBARI, FALCON*, OOZIE
      • DATA SERVICES (Data Access, Data Movement): PIG, HIVE & HCATALOG, HBASE, FLUME, SQOOP
      • CORE SERVICES (Process, Resource Management, Storage, Data Security): MAP REDUCE, TEZ, YARN, HDFS, NFS, WebHDFS, KNOX*
      • Enterprise Readiness: High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots

  11. HDP: A Complete Hadoop Distribution
      1 Key Services: Platform, Operational and Data services essential for the enterprise
      2 Skills: Leverage your existing skills: development, analytics, operations
      3 Integration: Interoperable with existing data center investments
      HORTONWORKS DATA PLATFORM (HDP)
      • OPERATIONAL SERVICES (Cluster Mgmt, Dataset Mgmt, Schedule): AMBARI, FALCON*, OOZIE
      • DATA SERVICES (Data Access, Data Movement, Load & Extract): PIG, HIVE & HCATALOG, HBASE, FLUME, SQOOP
      • CORE SERVICES (Process, Resource Management, Storage): MAP REDUCE, TEZ, YARN, HDFS, NFS, WebHDFS, KNOX*
      • Enterprise Readiness: High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots
      OS/VM, Cloud, Appliance

  12. Hadoop 2: The Introduction of YARN
      Store all data in a single place, interact in multiple ways
      1st Gen of Hadoop: Single Use System (Batch Apps)
      • MapReduce (cluster resource management & data processing)
      • HDFS (redundant, reliable storage)
      HADOOP 2: Multi Use Data Platform (Batch, Interactive, Online, Streaming, …)
      • Standard Query Processing: Hive, Pig
      • Online Data Processing: HBase, Accumulo
      • Real Time Stream Processing: Storm
      • others…
      • Batch: MapReduce
      • Interactive: Tez
      • Efficient Cluster Resource Management & Shared Services (YARN)
      • Redundant, Reliable Storage (HDFS)

  13. Apache Hadoop YARN
      The data operating system for Hadoop 2.0
      • Flexible: Enables other purpose-built data processing models beyond MapReduce (batch), such as interactive and streaming
      • Efficient: Double processing IN Hadoop on the same hardware while providing predictable performance & quality of service
      • Shared: Provides a stable, reliable, secure foundation and shared operational services across multiple workloads
      Data Processing Engines Run Natively IN Hadoop
      • BATCH: MapReduce
      • INTERACTIVE: Tez
      • ONLINE: HBase, Accumulo
      • STREAMING: Storm
      • IN-MEMORY: Spark
      • GRAPH: Giraph
      • OTHERS: SAS LASR, HPA
      YARN: Cluster Resource Management
      HDFS: Redundant, Reliable Storage

  14. Driving Our Innovation Through Apache
      Hortonworks' mission is to power your modern data architecture by enabling Hadoop to be an enterprise data platform that deeply integrates with your data center technologies
      Total Net Lines Contributed to Apache Hadoop: End Users; 449,768 lines; 614,041 lines; 147,933 lines
      Total Number of Committers to Apache Hadoop: 63 total; 10 Others 21; LinkedIn: 3; IBM: 3; Facebook: 5; Yahoo: 10; Cloudera: 7

  15. Patterns for Hadoop Applications
      1 Key Services: Platform, operational and data services essential for the enterprise
      2 Skills: Leverage your existing skills: development, analytics, operations
      3 Integration: Interoperable with existing data center investments
      • DEVELOP: Collect, Process, Build
      • ANALYZE: Explore, Query, Deliver
      • OPERATE: Provision, Manage, Monitor

  16. Familiar and Existing Tools
      1 Key Services: Platform, operational and data services essential for the enterprise
      2 Skills: Leverage your existing skills: development, analytics, operations
      3 Integration: Interoperable with existing data center investments
      • DEVELOP: Collect, Process, Build
      • ANALYZE: Explore, Query, Deliver (BusinessObjects BI)
      • OPERATE: Provision, Manage, Monitor

  17. SQL Interactive Query & Apache Hive
      1 Key Services: Platform, operational and data services essential for the enterprise
      2 Skills: Leverage your existing skills (SQL): development, analytics, operations
      3 Integration: Interoperable with existing data center investments
      Apache Hive
      • The de facto standard for Hadoop SQL access
      • Used by your current data center partners
      • Built for batch AND interactive query
      Stinger Initiative
      • Broad, community-based effort to deliver the next generation of Apache Hive
      • Speed: Improve Hive query performance by 100X to allow for interactive query times (seconds)
      • Scale: The only SQL interface to Hadoop designed for queries that scale from TB to PB
      • SQL: Support broadest range of SQL semantics for analytic applications against Hadoop

  18. Requirements for Enterprise Hadoop
      3 Integration: Interoperable with existing data center investments
      Integrate with
      • Applications: Business Intelligence, Developer IDEs, Data Integration
      • Systems: Data Systems & Storage, Systems Management
      • Platforms: Operating Systems, Virtualization, Cloud, Appliances
      Architecture layers
      • APPLICATIONS: Custom Applications, Packaged Applications, Business Analytics
      • OPERATIONAL TOOLS: Manage & Monitor
      • DEV & DATA TOOLS: Build & Test
      • DATA SYSTEM: Repositories (RDBMS, EDW, MPP)
      • SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs), Emerging Sources (Sensor, Sentiment, Geo, Unstructured)

  19. Broad Ecosystem Integration
      • APPLICATIONS: BusinessObjects BI
      • DEV & DATA TOOLS
      • OPERATIONAL TOOLS
      • DATA SYSTEM: RDBMS, EDW, MPP, HANA
      • INFRASTRUCTURE
      • SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs), Emerging Sources (Sensor, Sentiment, Geo, Unstructured)

  20. Relying on Hortonworks…
      Teradata Portfolio for Hadoop
      • Seamless data access between Teradata and Hadoop (SQL-H)
      • Simple management & monitoring with Viewpoint integration
      • Flexible deployment options
      HDInsight & HDP for Windows
      • Only Hadoop Distribution for Windows Azure & Windows Server
      • Native integration with SQL Server, Excel, and System Center
      • Extends Hadoop to .NET community
      Instant Access + Infinite Scale
      • SAP can assure their customers they are deploying an SAP HANA + Hadoop architecture fully supported by SAP
      • Enables analytics apps (BOBJ) to interact with Hadoop
      Complete Portfolio for Hadoop UDA Diagram Appliances

  21. Our Approach Community Driven Enterprise Apache Hadoop

  22. HDP 2.0: Enterprise Hadoop Platform
      Hortonworks Data Platform (HDP)
      • The ONLY 100% open source and most current platform
      • Integrates full range of enterprise-ready services
      • Certified and tested at scale
      • Engineered for deep ecosystem interoperability
      HORTONWORKS DATA PLATFORM (HDP)
      • OPERATIONAL SERVICES (Cluster Mgmt, Dataset Mgmt, Schedule): AMBARI, FALCON*, OOZIE
      • DATA SERVICES (Data Access, Data Movement, Load & Extract): PIG, HIVE & HCATALOG, HBASE, FLUME, SQOOP
      • CORE SERVICES (Process, Resource Management, Storage): MAP REDUCE, TEZ, YARN, HDFS, NFS, WebHDFS, KNOX*
      • Enterprise Readiness: High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots
      OS/VM, Cloud, Appliance

  23. HDP 2.0: Reliable, Consistent & Current
      HDP certifies most recent & stable community innovation
      [Version matrix: Hortonworks Data Platform releases by component]
      • Components: Hadoop, Pig, HCatalog, Hive, HBase, Sqoop, Flume, Oozie, Zookeeper, Mahout, Ambari
      • Releases: HDP 1.0 (June 2012), HDP 1.1 (Sept 2012), HDP 1.2 (Feb 2013), HDP 1.3 (May 2013), HDP 2.0 (Oct 2013)

  24. Flexible Support Subscription Programs
      Leverage Hortonworks Expertise
      Subscription and Support delivered and backed by Hadoop experts

  25. Transferring Hadoop Expertise
      Apache Hadoop Training & Certification
      • World class training programs
      • Designed to help you learn fast
      • Role-based hands-on classes with 50% lab time
      • Hadoop Certification demonstrates expertise in Development & Administration
      • Expert consulting services
      • Programs designed to transfer knowledge
      • Industry leading Hadoop Sandbox
      • Free download
      • Fastest way to learn Apache Hadoop
      • Personal, portable Hadoop environment

  26. Hortonworks: The Value of “Open” for You
      Validate & Try
      • Download the Hortonworks Sandbox
      • Learn Hadoop using the technical tutorials
      • Investigate a business case using the step-by-step business case scenarios
      • Validate YOUR business case using your data in the sandbox
      Engage
      • Execute a Business Case Discovery Workshop with our architects
      • Build a business case for Hadoop today
      Connect With the Hadoop Community: We employ a large number of Apache project committers & innovators so that you are represented in the open source community
      Avoid Vendor Lock-In: Hortonworks Data Platform remains as close to the open source trunk as possible and is developed 100% in the open so you are never locked in
      The Partners you Rely On, Rely On Hortonworks: We work with partners to deeply integrate Hadoop with data center technologies so you can leverage existing skills and investments
      Certified for the Enterprise: We engineer, test and certify the Hortonworks Data Platform at scale to ensure the reliability and stability you require for enterprise use
      Support from the Experts: We provide the highest quality of support for deploying at scale. You are supported by hundreds of years of Hadoop experience
