Web Search for a Planet: The Google Cluster Architecture
Web Search for a Planet: The Google Cluster Architecture. Eugenio De Hoyos, 6175 Computer Science Seminar, October 4, 2011. “… a single query on Google reads hundreds of megabytes of data and consumes tens of billions of CPU cycles …”


Presentation Transcript


Web Search for a Planet: The Google Cluster Architecture

Eugenio De Hoyos

6175 Computer Science Seminar

October 4, 2011


introduction


“… a single query on Google reads hundreds of megabytes of data and consumes tens of billions of CPU cycles …”

IO: 500 MB @ 20 MB/s → 25 sec

CPU: 10 × 10^9 cycles @ 2 GHz → 5 sec
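The two estimates on this slide are plain divisions. A minimal sketch of that arithmetic follows; the function names are illustrative, and the 100-way split at the end is a hypothetical, idealized assumption (perfectly even division, no coordination cost) rather than a figure from the slide.

```python
def io_seconds(megabytes: float, mb_per_second: float) -> float:
    """Time to read `megabytes` at a sustained rate of `mb_per_second`."""
    return megabytes / mb_per_second


def cpu_seconds(cycles: float, clock_hz: float) -> float:
    """Time to execute `cycles` on one CPU clocked at `clock_hz`."""
    return cycles / clock_hz


def per_machine_seconds(total_seconds: float, machines: int) -> float:
    """Idealized split: work divides evenly, with no coordination overhead."""
    return total_seconds / machines


io = io_seconds(500, 20)        # slide figures: 500 MB at 20 MB/s
cpu = cpu_seconds(10e9, 2e9)    # slide figures: 10 x 10^9 cycles at 2 GHz
print(io, cpu)

# Hypothetical: the same I/O spread evenly over 100 machines.
print(per_machine_seconds(io, 100))
```

On a single machine either figure is far too slow for an interactive query, which is the motivation for spreading the work across a cluster.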




outline

A Single Query

Philosophy

Power

Index Hardware

Index Memory

Conclusion


a single query

http://www.googlefalle.com


a single query

[Diagram: a Hardware Load Balancer in front of a pool of Google Web Servers. A Google Web Server talks to the Index Servers and the Document Servers; each of these is divided into Shards, and each Shard is served by several PCs. The steps of the query are numbered 1 to 4.]



philosophy

Service C

Service B

Service A





the power problem

RAM/BOARD

HD


A Google data center, circa 2000. Note the fan on the floor to cool servers.

(Credit: Stephen Shankland-CNET News.com/Jeff Dean-Google)


their observation

Equipment Cost

Power & Cooling

Scale


are their numbers right?

Min. Amortization

Requires

$ 1,500

Operating Costs

Min. Cost

Requires

$ 20,000

Amortization

Cost of inefficiency
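One way to arrive at a monthly operating figure like the $1,500 above is energy used times price per unit of energy. The sketch below assumes a rack that consumes about 10 MWh per month including cooling, at $0.15 per kWh; both inputs are assumptions chosen to reproduce the slide's number and are not stated on the slide itself.

```python
def monthly_power_cost(mwh_per_month: float, dollars_per_kwh: float) -> float:
    """Monthly energy bill: convert MWh to kWh, then multiply by the price."""
    kwh_per_month = mwh_per_month * 1000
    return kwh_per_month * dollars_per_kwh


# Assumed inputs (not from the slide): 10 MWh per month, $0.15 per kWh.
cost = monthly_power_cost(10, 0.15)
print(round(cost))
```

Changing either input shows how sensitive the operating cost is to power draw and electricity price.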




hardware

index server

RAM

CPU

Hard Drive


hardware

Short Pipeline: Pentium III

Long Pipeline: Pentium IV

[Animation: instructions numbered 0 to 9 advance one stage per clock cycle through each pipeline.]
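A rough way to see why pipeline length matters: every mispredicted branch flushes the pipeline, and the cost of a flush grows with the number of stages. The sketch below is a back-of-the-envelope model only. The stage counts (10 and 20), the branch fraction, and the misprediction rate are illustrative assumptions, not measurements from the slides.

```python
def effective_cpi(
    base_cpi: float,
    branch_fraction: float,
    mispredict_rate: float,
    flush_penalty_cycles: int,
) -> float:
    """Average cycles per instruction once branch-flush stalls are added."""
    stalls_per_instruction = branch_fraction * mispredict_rate * flush_penalty_cycles
    return base_cpi + stalls_per_instruction


# Assumed workload: 20% of instructions are branches, 5% of those mispredict.
short_pipeline = effective_cpi(1.0, 0.20, 0.05, flush_penalty_cycles=10)
long_pipeline = effective_cpi(1.0, 0.20, 0.05, flush_penalty_cycles=20)

print(f"short pipeline CPI: {short_pipeline:.2f}")
print(f"long pipeline CPI:  {long_pipeline:.2f}")
```

Under these assumptions the longer pipeline pays twice the stall cost per instruction, so a higher clock rate does not translate fully into higher throughput on branch-heavy code.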




hardware

instruction level parallelism

thread level parallelism

[Diagram: instruction streams numbered 1 to 5, shown for each form of parallelism.]
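Thread level parallelism fits a workload made of many independent requests, since each request can be given its own thread. A minimal sketch using Python's standard thread pool follows; `handle_query` and the sample queries are hypothetical stand-ins for real query processing.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def handle_query(query: str) -> int:
    """Stand-in for real work: count the words in the query."""
    return len(query.split())


queries: List[str] = ["web search", "google cluster architecture", "planet"]

# Each query is independent, so the pool may process them concurrently.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(handle_query, queries))

print(results)  # [2, 3, 1]
```

`pool.map` returns results in input order regardless of which thread finishes first. In CPython the threads show the structure of the idea; CPU-bound work would need processes or separate machines to run truly in parallel.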


hardware

simultaneous multithreading (SMT)

[Diagram: one CPU with L1 and L2 caches executing several instruction streams (instructions numbered 1 to 5) at the same time.]


hardware

chip multiprocessor (CMP)

[Diagram: two CPUs, each with its own L1 cache, sharing one L2 cache; each CPU runs its own instruction streams (instructions numbered 1 to 5).]




memory & scalability

Unpredictable memory access

Large cache lines and prefetching help

RAM

line length

Cache

CPU

cache length

Memory bandwidth: OK
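The benefit of larger cache lines for sequential reads can be illustrated by counting misses. The sketch below assumes a cold cache, a block that starts on a line boundary, and one pass over the block; the 4096-byte block and the line sizes are hypothetical values chosen for illustration.

```python
def sequential_misses(num_bytes: int, line_bytes: int) -> int:
    """Misses for one sequential pass over a cold cache: one per line touched."""
    return -(-num_bytes // line_bytes)  # ceiling division


block_bytes = 4096  # hypothetical size of one index data block
for line_bytes in (32, 64, 128):
    print(line_bytes, sequential_misses(block_bytes, line_bytes))
```

Doubling the line size halves the misses for a sequential scan, at the price of moving more bytes per miss, which is why the available memory bandwidth matters as well.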




conclusion

Cluster architecture is ideal and least expensive

Maximize throughput

Software Reliability


conclusion

Service C

Service B

Service A


a discussion question…

HDMI Monitor

USB Keyboard

700 MHz ARM 11

128 MB RAM

OpenGL ES 2.0, 1080p

-- David Braben, UK game developer


questions?

