CUDA Performance Study on Hadoop MapReduce Clusters


CUDA Performance Study on Hadoop MapReduce Clusters. CSE 930 Advanced Computer Architecture @ Fall 2010. Chen He Peng Du [email protected] [email protected] University of Nebraska-Lincoln. Overview. Introduction Methodology Evaluation Conclusions Future work. Introduction.

Presentation Transcript

CUDA Performance Study on Hadoop MapReduce Clusters

CSE 930 Advanced Computer Architecture @ Fall 2010

Chen He Peng Du

[email protected]@cse.unl.edu

University of Nebraska-Lincoln

Overview
  • Introduction
  • Methodology
  • Evaluation
  • Conclusions
  • Future work
Introduction
  • Hadoop MapReduce
Introduction

CPU+GPU Architecture

Introduction
  • Questions
    • Can we introduce CUDA into Hadoop MapReduce Clusters?
      • Mechanism and implementation
    • Is this reasonable?
      • Effects and Costs
Methodology
  • Question 1: Can we introduce CUDA into Hadoop?
Methodology
  • Test cases
    • SDK programs
      • Data intensive: Matrix Multiplication
      • Computation intensive: Monte Carlo
    • MDMR
      • Pure Java program
      • Introduce JCuda
Methodology
  • Port CUDA programs onto Hadoop
    • GPU (CUDA-C) vs CPU (C)
    • Approach
      • MapRed (processHadoopData & cudaCompute)
      • Main (Hadoop Pipes)
      • Scripts (runbase.sh, run-<prog>-CPU/GPU.sh)
      • Input data generators
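
The split between a Hadoop-facing parser and a compute routine can be illustrated with a minimal sketch. The two function names come from the slide, but their signatures are elided there as "(..)", so the parameter lists, the `MatrixPair` struct, the record layout, and the `mapRecord` helper below are all assumptions made for illustration. The compute routine is written as a plain CPU loop so the sketch runs without a GPU; in a port like the one described, that body would presumably be where the CUDA kernel is launched.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record layout (an assumption, not from the slides):
// "n a11 ... ann b11 ... bnn", i.e. the matrix order followed by two
// row-major n x n matrices.
struct MatrixPair {
    std::size_t n = 0;
    std::vector<float> a, b;
};

// Hadoop-facing half: turns one text record into plain arrays.
// The signature is assumed; the slides only show "processHadoopData(..)".
bool processHadoopData(const std::string& record, MatrixPair& out) {
    std::istringstream in(record);
    if (!(in >> out.n) || out.n == 0) return false;
    const std::size_t count = out.n * out.n;
    out.a.assign(count, 0.0f);
    out.b.assign(count, 0.0f);
    for (float& v : out.a) if (!(in >> v)) return false;
    for (float& v : out.b) if (!(in >> v)) return false;
    return true;
}

// Compute half: a CPU stand-in for the work a .cu file would do on the GPU.
// The signature is assumed; the slides only show "cudaCompute(..)".
std::vector<float> cudaCompute(const MatrixPair& m) {
    std::vector<float> c(m.n * m.n, 0.0f);
    for (std::size_t i = 0; i < m.n; ++i)
        for (std::size_t k = 0; k < m.n; ++k)
            for (std::size_t j = 0; j < m.n; ++j)
                c[i * m.n + j] += m.a[i * m.n + k] * m.b[k * m.n + j];
    return c;
}

// Hypothetical stand-in for the body of a mapper's map() call:
// parse the record, compute, and serialize the result as text.
// Returns an empty string when the record cannot be parsed.
std::string mapRecord(const std::string& record) {
    MatrixPair m;
    if (!processHadoopData(record, m)) return "";
    const std::vector<float> c = cudaCompute(m);
    std::ostringstream out;
    for (std::size_t i = 0; i < c.size(); ++i) {
        if (i) out << ' ';
        out << c[i];
    }
    return out.str();
}
```

Calling `mapRecord("2 1 2 3 4 5 6 7 8")` multiplies [[1, 2], [3, 4]] by [[5, 6], [7, 8]] and returns the product as a space-separated row-major string. Keeping parsing and computation in separate functions means the same mapper code can be built against either a CPU or a GPU implementation of the compute step, which matches the GPU (CUDA-C) vs CPU (C) comparison listed above.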
Methodology
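
[Figure: porting workflow. Labels recovered from the diagram: an Input Generator (void generate(..)) generates data on the Hadoop DFS. The .c and .cu sources of the CUDA MonteCarlo program are extracted into the Hadoop-enabled MonteCarlo program, which contains MonteCarlo.cpp (void processHadoopData(..), void cudaCompute(..)) and MapRed.cpp (class Mapper, class Reducer).]
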
Methodology
  • MDMR
    • Time complexity when using the CPU
    • We can employ the GPU to parallelize the n-squared portion, reducing the time complexity to linear
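
The reasoning behind the drop from quadratic to linear time can be sketched as follows. The slides do not say what MDMR computes, so the sum of absolute differences below is a made-up stand-in, and the function names are hypothetical. The point is only the shape of the loop: the outer iterations are independent, so each one could be given to its own GPU thread, leaving every thread with a single O(n) inner loop (assuming enough threads run concurrently).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Work for one index i: an O(n) loop over every other element.
// On a GPU, this body is what a single thread would execute.
// The pairwise term |x[i] - x[j]| is an illustrative placeholder.
double rowContribution(const std::vector<double>& x, std::size_t i) {
    double sum = 0.0;
    for (std::size_t j = 0; j < x.size(); ++j) {
        if (j != i) sum += std::fabs(x[i] - x[j]);
    }
    return sum;
}

// Serial CPU driver: n calls of O(n) work each, so O(n^2) in total.
// No call reads another call's result, so the outer loop can be
// replaced by n concurrent threads without changing the output.
std::vector<double> allContributions(const std::vector<double>& x) {
    std::vector<double> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = rowContribution(x, i);
    }
    return out;
}
```

For the input {0, 1, 3}, `allContributions` returns {4, 3, 5}. The total amount of arithmetic is unchanged by parallelization; only the elapsed time per run shrinks, which is why the gain depends on how many threads the GPU can actually keep busy.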
Evaluation

Environment
  • Head: 2x AMD 2.2 GHz, 4 GB DDR400 RAM, 800 GB HD
  • 3 PCs (AMD 2.3 GHz CPU, 2 GB DDR2-667 RAM, 400 GB HD, 1 Gbps Ethernet)
  • XFX 9400GT, 64-bit, 512 MB DDR3 ($20)
  • CUDA 3.2 Toolkit
  • Hadoop 0.20.3
  • ServerTech CWG-CDU power distribution unit (for power consumption monitoring)

Factors
  • Speedup
  • Power consumption
  • Cost
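
The first two factors can be expressed as simple formulas. This is a minimal sketch under one assumption: that power consumption is compared as energy, meaning average power multiplied by run time. The slides do not state how the power distribution unit readings were aggregated, and the function names here are hypothetical.

```cpp
// Speedup: how many times faster the GPU run is than the CPU run.
double speedup(double cpuSeconds, double gpuSeconds) {
    return cpuSeconds / gpuSeconds;
}

// Energy in joules from average power (watts) and run time (seconds).
// Assumes the power meter reports, or can be averaged to, a mean wattage.
double energyJoules(double avgWatts, double seconds) {
    return avgWatts * seconds;
}

// Fraction of energy saved by the GPU run relative to the CPU run.
double energySavedFraction(double cpuJoules, double gpuJoules) {
    return 1.0 - gpuJoules / cpuJoules;
}
```

With placeholder inputs (not measurements from this study), a CPU run of 100 s at 80 W and a GPU run of 25 s at 80 W give a speedup of 4 and an energy saving of 0.75. Under this model, a shorter run saves energy even when the node draws the same power, so execution time and energy use are closely linked.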

Evaluation

Matrix Multiplication (Execution time)

Evaluation

Matrix Multiplication (Power consumption)

Evaluation
  • MDMR
    • Execution time
Evaluation
  • MDMR
    • Power consumption
Conclusions
  • Introduced GPUs into a MapReduce cluster and obtained up to a 20x speedup.
  • Reduced power consumption by up to 19/20 (95%) with the current preliminary solution and workload.
  • At as little as $20 per node, deploying GPUs on Hadoop has a high benefit-to-cost ratio compared with upgrading CPUs or adding more nodes.
  • Provided practical implementations for people who want to build MapReduce clusters with GPUs.
Future Work
  • Port more CUDA programs onto Hadoop.
  • Incorporate reducers into the experiments.
  • Support heterogeneous clusters that mix GPU nodes and non-GPU nodes.