Scalability: Scale Up and Scale Out

  • Grow Up with SMP
    • 4xP6 is now standard
    • Personal System, Departmental Server, SMP Super Server
  • Grow Out with Cluster
    • Cluster has inexpensive parts
    • Cluster of PCs

Thesis: Many little beat few big

  • Price points: $1 million, $100 K, $10 K
  • Processor classes: Mainframe, Mini, Micro, Nano, Pico Processor (1 MM^3)
  • Storage levels: 10 pico-second ram, 10 nano-second ram, 10 microsecond ram, 10 millisecond disc, 10 second tape archive
  • Capacities: 1 MB, 100 MB, 10 GB, 1 TB, 100 TB
  • Disc form factors: 14", 9", 5.25", 3.5", 2.5", 1.8"
  • 1 M SPECmarks, 1 TFLOP
  • 10^6 clocks to bulk ram
  • Event-horizon on chip
  • VM reincarnated
  • Multi-program cache, On-Chip SMP

  • Smoking, hairy golf ball
  • How to connect the many little parts?
  • How to program the many little parts?
  • Fault tolerance & Management?

Gray - Microsoft @ LANL 12/17/98

4 B PC’s (1 Bips, .1 GB dram, 10 GB disk, 1 Gbps Net, B=G): The Bricks of Cyberspace
  • Cost $1,000
  • Come with
    • NT
    • DBMS
    • High speed Net
    • System management
    • GUI / OOUI
    • Tools
  • Compatible with everyone else
  • CyberBricks

Computers shrink to a point
  • Scale: Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta
  • Disks 100x in 10 years: 2 TB 3.5” drive
  • Shrink to 1” is 200 GB
  • Disk is super computer!
  • This is already true of printers and “terminals”
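
The 100x-in-10-years figure above can be restated as a yearly rate. This is a minimal sketch that assumes steady compound growth, which the slide does not state; the function name is illustrative.

```python
def annual_growth_rate(total_factor: float, years: float) -> float:
    """Compound annual growth rate implied by total_factor over the given years."""
    return total_factor ** (1.0 / years) - 1.0


# 100x over 10 years, as quoted on the slide.
rate = annual_growth_rate(100, 10)
print(f"{rate:.1%} per year")
```

Under that assumption, disk capacity grows by a bit under 60% each year.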

Microsoft.com: ~150x4 nodes: a crowd

[Diagram: microsoft.com server farm. Groups of servers with average configurations such as "4xP6, 512 RAM, 30 GB HD" (some 4xP5; RAM from 256 to 1 GB; disk from 12 GB to 160 GB) and average costs of $25K to $83K. Server groups shown: www.microsoft.com, home.microsoft.com, premium.microsoft.com, search.microsoft.com, register.microsoft.com, support.microsoft.com, activex.microsoft.com, cdm.microsoft.com, msid.msn.com, register.msn.com, FTP.microsoft.com, FTP and HTTP download servers, live SQL servers, SQL reporting, SQL consolidators, and staging servers (Building 11, IDC, DMZ). Sites: MOSWest, European Data Center, Japan Data Center. Network elements: FDDI rings (MIS1 to MIS4), switched Ethernet, primary and secondary Gigaswitch, routers, SQLNet, feeder and admin LANs, and Internet links labeled OC3, DS3 (45 Mb/sec each), and 100 Mb/sec each.]

HotMail: ~400 Computers Crowd

DB Clusters (crowds)
  • 16-node Cluster
    • 64 cpus
    • 2 TB of disk
    • Decision support
  • 45-node Cluster
    • 140 cpus
    • 14 GB DRAM
    • 4 TB RAID disk
    • OLTP (Debit Credit)
      • 1 B tpd (14 k tps)

Windows NT Versus UNIX
  • Best results on an SMP: SemiLog plot shows 3x (2 year) lead by UNIX
  • Does not show Oracle/Alpha Cluster at 100,000 tpmC
  • All these numbers are off-scale huge (20,000 active users?)

Bottleneck Analysis
  • Drawn to linear scale

  • Theoretical bus bandwidth: 422 MBps = 66 MHz x 64 bits
  • Memory read/write: ~150 MBps
  • MemCopy: ~50 MBps
  • Disk R/W: ~9 MBps

Bottleneck Analysis

  • Adapter: ~70 MBps
  • PCI: ~110 MBps
  • Memory read/write: ~250 MBps
  • NTFS Read/Write
  • 18 Ultra 3 SCSI on 4 strings (2x4 and 2x5), 3 PCI 64
    • ~155 MBps unbuffered read (175 raw)
    • ~95 MBps unbuffered write
  • Good, but 10x down from our UNIX brethren (SGI, SUN)

Sandia/Compaq/ServerNet/NT Sort
  • Sort 1.1 Terabyte (13 Billion records) in 47 minutes
  • 68 nodes (dual 450 MHz processors), 543 disks, 1.5 M$
  • 1.2 GBps network rap (2.8 GBps pap)
  • 5.2 GBps of disk rap (same as pap)
  • (rap = real application performance, pap = peak advertised performance)
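
The totals above imply some average rates that the slide does not spell out. This is a rough sketch, assuming a decimal terabyte (10^12 bytes) and an even spread of work across nodes and disks; the function and key names are illustrative.

```python
def sort_rates(total_bytes, minutes, nodes, disks, records):
    """Derive end-to-end average rates from the totals quoted on the slide."""
    seconds = minutes * 60
    aggregate_mbps = total_bytes / seconds / 1e6  # decimal megabytes per second
    return {
        "aggregate_MBps": aggregate_mbps,
        "per_node_MBps": aggregate_mbps / nodes,
        "per_disk_MBps": aggregate_mbps / disks,
        "bytes_per_record": total_bytes / records,
    }


# 1.1 TB, 47 minutes, 68 nodes, 543 disks, 13 billion records.
rates = sort_rates(1.1e12, 47, 68, 543, 13e9)
for name, value in rates.items():
    print(f"{name}: {value:.1f}")
```

These are averages of sorted data over the whole run, so they are not directly comparable to the network and disk rap figures.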

Progress on Sorting: NT now leads both price and performance
  • Speedup comes from Moore’s law 40%/year
  • Processor/Disk/Network arrays: 60%/year (this is a software speedup).

The Microsoft TerraServer Hardware
  • Compaq AlphaServer 8400
  • 8x400 MHz Alpha cpus
  • 10 GB DRAM
  • 324 9.2 GB StorageWorks Disks
    • 3 TB raw, 2.4 TB of RAID5
  • STK 9710 tape robot (4 TB)
  • WindowsNT 4 EE, SQL Server 7.0

TerraServer: Lots of Web Hits

              Total      Average    Peak
  Hits        1,065 m    8.1 m      29 m
  Queries     877 m      6.7 m      18 m
  Images      742 m      5.6 m      15 m
  Page Views  170 m      1.3 m      6.6 m
  Users       6.4 m      48 k       76 k
  Sessions    10 m       77 k       125 k

[Chart: Sessions, Hit, Page View, DB Query, and Image counts by date, 6/22/98 to 10/26/98]

  • A billion web hits!
  • 1 TB, largest SQL DB on the Web
  • 100 Qps average, 1,000 Qps peak
  • 877 M SQL queries so far

SQL 7 TerraServer Availability
  • Operating for 4 months: 3,133 hrs
  • Unscheduled outage: 36.5 minutes: 99.98% scheduled up
  • Scheduled outage: 60 minutes
  • Availability: 99.95% overall up
  • No NT failures (ever)
  • One SQL7 Beta2 bug
  • No failures in Aug, Oct
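
The two percentages above follow from the quoted hours and outage minutes. A minimal sketch of the arithmetic, with an illustrative function name:

```python
def availability(total_hours: float, outage_minutes: float) -> float:
    """Fraction of the operating period during which the service was up."""
    total_minutes = total_hours * 60
    return 1.0 - outage_minutes / total_minutes


HOURS = 3133          # operating period from the slide
UNSCHEDULED = 36.5    # minutes
SCHEDULED = 60.0      # minutes

scheduled_up = availability(HOURS, UNSCHEDULED)
overall_up = availability(HOURS, UNSCHEDULED + SCHEDULED)
print(f"{scheduled_up:.2%} scheduled up, {overall_up:.2%} overall up")
```

Counting only unscheduled outages gives the 99.98% figure; adding the scheduled 60 minutes gives 99.95%.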

Backup / Restore

NCSA Super Cluster
  • National Center for Supercomputing Applications, University of Illinois @ Urbana
  • 512 Pentium II cpus, 2,096 disks, SAN
  • Compaq + HP + Myricom + WindowsNT
  • A Super Computer for 3M$
  • Classic Fortran/MPI programming
  • DCOM programming model

http://access.ncsa.uiuc.edu/CoverStories/SuperCluster/super.html

Data Rivers: Split + Merge Streams

[Diagram: N producers, a River of N x M data streams, M consumers]

  • Producers add records to the river,
  • Consumers consume records from the river
  • Purely sequential programming.
  • River does flow control and buffering
    • does partition and merge of data records
  • River = Split/Merge in Gamma = Exchange operator in Volcano /SQL Server.
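
The river idea above can be sketched in a few lines. This is an illustration only, not the Gamma, Volcano, or SQL Server implementation: it assumes threads inside one process, bounded queues for flow control and buffering, and a caller-supplied partition function. The names `River` and `run_river` are hypothetical.

```python
import queue
import threading

_DONE = object()  # sentinel a producer sends when it has no more records


class River:
    """N producers split records across M bounded queues; each consumer merges one."""

    def __init__(self, n_producers, m_consumers, partition, buffer_size=64):
        self._n = n_producers
        self._partition = partition
        self._queues = [queue.Queue(maxsize=buffer_size) for _ in range(m_consumers)]

    def put(self, record):
        # Split: choose the consumer by partition; put() blocks when the buffer is full.
        self._queues[self._partition(record) % len(self._queues)].put(record)

    def close_producer(self):
        for q in self._queues:
            q.put(_DONE)

    def records(self, consumer_id):
        # Merge: yield records from all producers (interleaved, not ordered).
        remaining = self._n
        while remaining:
            item = self._queues[consumer_id].get()
            if item is _DONE:
                remaining -= 1
            else:
                yield item


def run_river(sources, m_consumers, partition):
    river = River(len(sources), m_consumers, partition)
    results = [[] for _ in range(m_consumers)]

    def produce(source):  # purely sequential producer code
        for record in source:
            river.put(record)
        river.close_producer()

    def consume(cid):  # purely sequential consumer code
        for record in river.records(cid):
            results[cid].append(record)

    threads = [threading.Thread(target=produce, args=(s,)) for s in sources]
    threads += [threading.Thread(target=consume, args=(c,)) for c in range(m_consumers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


results = run_river([range(0, 10), range(10, 20), range(20, 30)], 2, lambda r: r)
```

Each producer and consumer is an ordinary loop; the partitioning, buffering, and end-of-stream handling live in the river object.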

Generalization: Object-oriented Rivers
  • Rivers transport sub-class of record-set (= stream of objects)
    • record type and partitioning are part of subclass
  • Node transformers are data pumps
    • an object with river inputs and outputs
    • do late-binding to record-type
  • Programming becomes data flow programming
    • specify the pipelines
  • Compiler/Scheduler does data partitioning and “transformer” placement

NT Cluster Sort as a Prototype
  • Using
    • data generation and
    • sort as a prototypical app
  • “Hello world” of distributed processing
  • goal: easy install & execute

Remote Install
  • Add Registry entry to each remote node.
    • RegConnectRegistry()
    • RegCreateKeyEx()

Cluster Startup Execution
  • Setup:
    • MULTI_QI struct
    • COSERVERINFO struct
  • CoCreateInstanceEx()
  • Retrieve remote object handle from MULTI_QI struct
  • Invoke methods as usual

[Diagram: MULTI_QI and COSERVERINFO, with three HANDLEs, each paired with a Sort()]

Cluster Sort Conceptual Model

[Diagram: records keyed AAA, BBB, and CCC spread across multiple nodes]

  • Multiple Data Sources
  • Multiple Data Destinations
  • Multiple nodes
  • Disks -> Sockets -> Disk -> Disk
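
The conceptual model above can be illustrated with a small single-process sketch. It assumes range partitioning on the key followed by a local sort at each destination, which the slide does not spell out; plain lists stand in for disks and sockets, and `cluster_sort` and `split_points` are hypothetical names.

```python
import bisect


def cluster_sort(sources, split_points):
    """Route each key to the destination that owns its range, then sort locally."""
    # One destination per key range: len(split_points) + 1 ranges in total.
    destinations = [[] for _ in range(len(split_points) + 1)]
    for source in sources:  # each source node scans its own data
        for key in source:
            dest = bisect.bisect_right(split_points, key)
            destinations[dest].append(key)  # stands in for a socket send
    return [sorted(d) for d in destinations]  # local sort on each destination


sources = [
    ["CCC", "AAA", "BBB"],
    ["BBB", "CCC", "AAA"],
    ["AAA", "BBB", "CCC"],
]
parts = cluster_sort(sources, ["B", "C"])
print(parts)
```

Because the ranges do not overlap, reading the destinations in order yields globally sorted output with no final merge step.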

[Diagram: partitions A, B, and C, each shown with AAA, BBB, and CCC records]
