
How Hadoop Works

Hadoop is an open source distributed processing framework that manages data processing and storage for Big Data applications running on clustered systems. It also supports predictive analytics, data mining, and machine learning applications.


Presentation Transcript


  1. How Hadoop Works. First of all, it is not Big Data that handles large amounts of data; Big Data itself consists of various types of data that need to be handled, and to handle them we use a framework called Hadoop. Hadoop is an open source distributed processing framework that manages data processing and storage for Big Data applications running on clustered systems, and it also supports predictive analytics, data mining, and machine learning applications. Hadoop can handle various forms of structured and unstructured data, giving users more flexibility for collecting, processing, and analyzing data than relational databases and data warehouses provide. Hadoop has two main systems. Hadoop Distributed File System (HDFS): the storage system for Hadoop, spread out over multiple machines as a means to reduce cost and increase reliability.

  2. MapReduce Engine: the algorithm that filters, sorts, and then processes the input data in some way. How HDFS Works: HDFS supports the rapid transfer of data between compute nodes. At its outset, it was closely coupled with MapReduce, a programmatic framework for data processing. When HDFS takes in data, it breaks the information down into separate blocks and distributes them to different nodes in a cluster, enabling highly efficient parallel processing. Moreover, the Hadoop Distributed File System is specially designed to be highly fault-tolerant. The file system replicates, or copies, each piece of data multiple times and distributes the copies to individual nodes, placing at least one copy on a different server rack than the others. As a result, the data on nodes that crash can be found elsewhere within the cluster, which ensures that processing can continue while the lost data is recovered.
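
To make the block-and-replica mechanics concrete, here is a minimal Java sketch against Hadoop's FileSystem API. The NameNode address (hdfs://namenode:9000), the file path, and the class name are placeholders for illustration, not values from the slides; the create() overload used here simply lets the caller request a replication factor and block size explicitly, while HDFS's rack-aware placement decides where the replicas actually land.

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsReplicationDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Connect to the NameNode; the URI is a placeholder for a real cluster.
            FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000"), conf);

            Path path = new Path("/user/demo/replicated.txt");
            short replication = 3;               // keep three copies of each block
            long blockSize = 128L * 1024 * 1024; // split the file into 128 MB blocks
            int bufferSize = 4096;

            // Write the file; HDFS chops it into blocks, stores 'replication'
            // copies of each block, and places at least one copy on another rack.
            try (FSDataOutputStream out =
                     fs.create(path, true, bufferSize, replication, blockSize)) {
                out.writeBytes("data spread across the cluster");
            }

            // The replication factor can also be changed after the fact.
            fs.setReplication(path, (short) 2);
            fs.close();
        }
    }

If a DataNode holding one of these replicas crashes, the NameNode notices the missing block reports and schedules re-replication from the surviving copies, which is the recovery behavior the slide describes.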

  3. HDFS uses a master/slave architecture. In its initial incarnation, each Hadoop cluster consisted of a single NameNode that managed file system operations, and supporting DataNodes that managed data storage on individual compute nodes. Together, these HDFS elements support applications with large data sets. How MapReduce Works: The MapReduce algorithm contains two important tasks, namely Map and Reduce. 1. The Map task takes a set of data and converts it into another set of data, where individual elements are broken down into tuples (key-value pairs). 2. The Reduce task takes the output from the Map as its input and combines those data tuples (key-value pairs) into a smaller set of tuples, as the sketch below illustrates.
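
As a concrete sketch of these two tasks, here is the classic word count written against Hadoop's Java MapReduce API: the Map task emits a (word, 1) tuple for every word in its input split, and the Reduce task combines all tuples that share a word into a single (word, total) tuple. The class and field names are illustrative.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    public class WordCount {
        // Map: break each input line into (word, 1) key-value pairs.
        public static class TokenizerMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, ONE);
                }
            }
        }

        // Reduce: combine all counts for the same word into a single total.
        public static class IntSumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            private final IntWritable result = new IntWritable();

            @Override
            protected void reduce(Text key, Iterable<IntWritable> values,
                                  Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                result.set(sum);
                context.write(key, result);
            }
        }
    }

Between the two phases, the framework shuffles and sorts the Map output so that every tuple with the same key reaches the same Reduce call, which is why the Reducer receives an Iterable of counts per word.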
