Scaleability scale up and scale out

Gray - Microsoft LANL 121798 - PowerPoint PPT Presentation




Presentation Transcript
Scaleability: Scale Up and Scale Out

  • Grow Up with SMP

    • 4xP6 is now standard

    • Personal System, Departmental Server, SMP Super Server

  • Grow Out with Cluster

    • Cluster has inexpensive parts

    • Cluster of PCs


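Thesis: Many little beat few big

[Figure: machine classes, storage hierarchy, and disk form factors. Labels recoverable from the figure:]

  • Machine classes: Mainframe, Mini, Micro, Nano, Pico Processor

  • Price points: $1 million, $100 K, $10 K

  • Storage latencies: 10 pico-second ram, 10 nano-second ram, 10 microsecond ram, 10 millisecond disc, 10 second tape archive

  • Storage capacities: 1 MB, 100 MB, 10 GB, 1 TB, 100 TB

  • Disk form factors: 14", 9", 5.25", 3.5", 2.5", 1.8"

  • Pico Processor: 1 MM^3; 1 M SPECmarks, 1 TFLOP; 10^6 clocks to bulk ram; Event-horizon on chip; VM reincarnated; Multi-program cache, On-Chip SMP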

  • Smoking, hairy golf ball

  • How to connect the many little parts?

  • How to program the many little parts?

  • Fault tolerance & Management?

Gray - Microsoft @ LANL 12/17/98


4 B PC’s (1 Bips, .1 GB dram, 10 GB disk, 1 Gbps Net, B=G): The Bricks of Cyberspace

  • Cost: $1,000

  • Come with

    • NT

    • DBMS

    • High speed Net

    • System management

    • GUI / OOUI

    • Tools

  • Compatible with everyone else

  • CyberBricks



Computers shrink to a point

Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta

  • Disks 100x in 10 years: 2 TB 3.5” drive

  • Shrink to 1” is 200 GB

  • Disk is super computer!

  • This is already true of printers and “terminals”



Microsoft.com: ~150x4 nodes: a crowd

[Network diagram of the microsoft.com server farm. Labels recoverable from the figure:]

  • Sites: Building 11, MOSWest, European Data Center, Japan Data Center

  • Networks: FDDI Rings (MIS1, MIS2, MIS3, MIS4), Switched Ethernet, Primary and Secondary Gigaswitch, Feeder LAN, Admin LAN, SQLNet, Routers, Internet links (OC3, DS3; 100 Mb/Sec each, 45 Mb/Sec each)

  • Web server groups: www.microsoft.com, home.microsoft.com, premium.microsoft.com, search.microsoft.com, register.microsoft.com, support.microsoft.com, activex.microsoft.com, cdm.microsoft.com, msid.msn.com, register.msn.com

  • Other server groups: FTP.microsoft.com, FTP and HTTP Download Servers, Staging Servers, IDC Staging Servers, DMZ Staging Servers, Internal WWW, Live SQL Servers, SQL Consolidators, SQL Reporting, Replication

  • Ave CFG: mostly 4xP6 (some 4xP5); 256 or 512 RAM (1 GB RAM in one group); 12 GB to 160 GB HD

  • Ave Cost labels: $25K, $28K, $83K


HotMail: ~400 Computers Crowd

DB Clusters (crowds)

  • 16-node Cluster

    • 64 cpus

    • 2 TB of disk

    • Decision support

  • 45-node Cluster

    • 140 cpus

    • 14 GB DRAM

    • 4 TB RAID disk

    • OLTP (Debit Credit)

      • 1 B tpd (14 k tps)

Windows NT Versus UNIX: Best Results on an SMP

  • SemiLog plot shows 3x (2 year) lead by UNIX

  • Does not show Oracle/Alpha Cluster at 100,000 tpmC

  • All these numbers are off-scale huge (20,000 active users?)


Bottleneck Analysis

  • Drawn to linear scale

  • Theoretical Bus Bandwidth: 422 MBps = 66 MHz x 64 bits

  • Memory Read/Write: ~150 MBps

  • MemCopy: ~50 MBps

  • Disk R/W: ~9 MBps
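
One way to read the figures on this slide is as a pipeline whose throughput is bounded by its slowest stage. The sketch below is an illustration only: it assumes the stages are independent and that disks add bandwidth linearly, neither of which the slide states. The names `STAGES_MBPS` and `disks_to_saturate` are hypothetical.

```python
import math

# Approximate stage bandwidths in MBps, as quoted on the slide.
STAGES_MBPS = {
    "bus (theoretical)": 422,
    "memory read/write": 150,
    "memcopy": 50,
}
DISK_MBPS = 9  # one disk, read/write


def disks_to_saturate(stage_mbps, disk_mbps=DISK_MBPS):
    """Smallest number of disks whose combined bandwidth meets or exceeds a stage.

    Assumes disk bandwidth adds up linearly (an idealization).
    """
    return math.ceil(stage_mbps / disk_mbps)


if __name__ == "__main__":
    for name, mbps in STAGES_MBPS.items():
        print(f"{name}: {disks_to_saturate(mbps)} disks")
```

Under these assumptions a handful of disks is enough to reach the memcopy rate, which is why the copy path rather than the bus is the stage to watch.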

Bottleneck Analysis

  • Adapter: ~70 MBps

  • PCI: ~110 MBps

  • Memory Read/Write: ~250 MBps

  • NTFS Read/Write

  • 18 Ultra 3 SCSI on 4 strings (2x4 and 2x5), 3 PCI 64

    • ~155 MBps unbuffered read (175 raw)

    • ~95 MBps unbuffered write

    • Good, but 10x down from our UNIX brethren (SGI, SUN)


Sandia/Compaq/ServerNet/NT Sort

  • Sort 1.1 Terabyte (13 Billion records) in 47 minutes

  • 68 nodes (dual 450 MHz processors), 543 disks, $1.5 M

  • 1.2 GBps network rap (2.8 GBps pap)

  • 5.2 GBps of disk rap (same as pap)

  • (rap = real application performance, pap = peak advertised performance)
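
As a rough check on the sort figures, the end-to-end rates can be worked out directly. This is a back-of-the-envelope sketch, assuming 1 Terabyte = 10^12 bytes and a uniform rate over the whole 47 minutes; the function name `sort_rates` is hypothetical.

```python
# Figures quoted on the slide.
TOTAL_BYTES = 1.1e12      # assumption: 1 TB = 10**12 bytes
TOTAL_RECORDS = 13e9
ELAPSED_SECONDS = 47 * 60
NODES = 68
DISKS = 543


def sort_rates():
    """Average end-to-end rates implied by the slide's totals."""
    total_mbps = TOTAL_BYTES / ELAPSED_SECONDS / 1e6
    return {
        "total_MBps": total_mbps,
        "per_node_MBps": total_mbps / NODES,
        "per_disk_MBps": total_mbps / DISKS,
        "records_per_second": TOTAL_RECORDS / ELAPSED_SECONDS,
    }


if __name__ == "__main__":
    for name, value in sort_rates().items():
        print(f"{name}: {value:,.2f}")
```

These are averages over the whole run. The network and disk rap figures on the slide are higher because data crosses the disks and the network more than once during a sort.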

Progress on Sorting: NT now leads both price and performance

  • Speedup comes from Moore’s law 40%/year

  • Processor/Disk/Network arrays: 60%/year (this is a software speedup).
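
To get a feel for the two annual rates, they can be converted into doubling times and ten-year factors. This is plain compound-growth arithmetic, assuming each rate holds steadily year over year; the function names are hypothetical.

```python
import math


def doubling_time_years(annual_rate):
    """Years needed to double at a steady annual growth rate (0.40 means 40%/year)."""
    return math.log(2) / math.log(1 + annual_rate)


def growth_factor(annual_rate, years):
    """Total multiplicative improvement after the given number of years."""
    return (1 + annual_rate) ** years


if __name__ == "__main__":
    for rate in (0.40, 0.60):
        print(
            f"{rate:.0%}/year: doubles every {doubling_time_years(rate):.2f} years, "
            f"{growth_factor(rate, 10):.0f}x in 10 years"
        )
```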

The Microsoft TerraServer Hardware

  • Compaq AlphaServer 8400

  • 8x400 MHz Alpha cpus

  • 10 GB DRAM

  • 324 9.2 GB StorageWorks Disks

    • 3 TB raw, 2.4 TB of RAID5

  • STK 9710 tape robot (4 TB)

  • WindowsNT 4 EE, SQL Server 7.0

TerraServer: Lots of Web Hits

                Total      Average    Peak
  Hits          1,065 m    8.1 m      29 m
  Queries       877 m      6.7 m      18 m
  Images        742 m      5.6 m      15 m
  Page Views    170 m      1.3 m      6.6 m
  Users         6.4 m      48 k       76 k
  Sessions      10 m       77 k       125 k

[Chart: Sessions, Hit, Page View, DB Query, and Image counts plotted by Date, with weekly ticks from 6/22/98 to 10/26/98; vertical axis 0 to 35.]

  • A billion web hits!

  • 1 TB, largest SQL DB on the Web

  • 100 Qps average, 1,000 Qps peak

  • 877 M SQL queries so far

SQL 7 TerraServer Availability

  • Operating for 4 months: 3,133 hrs

  • Unscheduled outage: 36.5 minutes: 99.98% scheduled up

  • Scheduled outage: 60 minutes

  • Availability: 99.95% overall up

  • No NT failures (ever)

  • One SQL7 Beta2 bug

  • No failures in Aug, Oct
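
The two availability percentages follow from the operating hours and outage minutes on this slide. A minimal sketch of that arithmetic, assuming availability is simply the fraction of total minutes that were not lost to outage; the function and constant names are hypothetical.

```python
def availability_percent(total_hours, outage_minutes):
    """Percentage of the operating period that was not lost to outage."""
    total_minutes = total_hours * 60
    return 100.0 * (1 - outage_minutes / total_minutes)


# Figures quoted on the slide.
OPERATING_HOURS = 3133
UNSCHEDULED_MINUTES = 36.5
SCHEDULED_MINUTES = 60

if __name__ == "__main__":
    scheduled_up = availability_percent(OPERATING_HOURS, UNSCHEDULED_MINUTES)
    overall_up = availability_percent(
        OPERATING_HOURS, UNSCHEDULED_MINUTES + SCHEDULED_MINUTES
    )
    print(f"scheduled up: {scheduled_up:.2f}%")
    print(f"overall up:   {overall_up:.2f}%")
```

Counting only unscheduled outage gives the 99.98% figure, and counting scheduled plus unscheduled outage gives the 99.95% figure.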

Backup / Restore

NCSA Super Cluster

  • National Center for Supercomputing Applications, University of Illinois @ Urbana

  • 512 Pentium II cpus, 2,096 disks, SAN

  • Compaq + HP +Myricom + WindowsNT

  • A Super Computer for $3 M

  • Classic Fortran/MPI programming

  • DCOM programming model

http://access.ncsa.uiuc.edu/CoverStories/SuperCluster/super.html

Data Rivers: Split + Merge Streams

[Diagram: N producers feed the River, which carries N x M data streams to M consumers.]

  • Producers add records to the river,

  • Consumers consume records from the river

  • Purely sequential programming.

  • River does flow control and buffering

    • does partition and merge of data records

  • River = Split/Merge in Gamma = Exchange operator in Volcano /SQL Server.
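
The river idea above can be sketched with threads and bounded queues. This is an illustration under stated assumptions, not the Gamma or SQL Server implementation: the `River` class, its method names, and the modulo partitioning are all hypothetical. Bounded queues stand in for flow control and buffering, and each producer and consumer is written as ordinary sequential code.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of a consumer's stream


class River:
    """N producers feed M consumers; records are partitioned by a key function."""

    def __init__(self, n_producers, m_consumers, partition, capacity=64):
        self._partition = partition
        # One bounded queue per consumer: put() blocks when full (flow control).
        self._queues = [queue.Queue(maxsize=capacity) for _ in range(m_consumers)]
        self._open_producers = n_producers
        self._lock = threading.Lock()

    def put(self, record):
        """Split: route a record to the consumer chosen by the partition function."""
        index = self._partition(record) % len(self._queues)
        self._queues[index].put(record)

    def close_producer(self):
        """Called once by each producer; the last one ends every consumer stream."""
        with self._lock:
            self._open_producers -= 1
            last = self._open_producers == 0
        if last:
            for q in self._queues:
                q.put(_DONE)

    def records(self, consumer_id):
        """Merge: yield the records routed to this consumer, from all producers."""
        q = self._queues[consumer_id]
        while True:
            item = q.get()
            if item is _DONE:
                return
            yield item


def run_demo():
    """Three producers, two consumers, integers partitioned by value modulo 2."""
    river = River(n_producers=3, m_consumers=2, partition=lambda r: r)
    results = [[] for _ in range(2)]

    def producer(start):
        for record in range(start, start + 10):
            river.put(record)
        river.close_producer()

    def consumer(consumer_id):
        results[consumer_id].extend(river.records(consumer_id))

    threads = [threading.Thread(target=producer, args=(s,)) for s in (0, 100, 200)]
    threads += [threading.Thread(target=consumer, args=(c,)) for c in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_demo()` returns two lists: consumer 0 receives the even records and consumer 1 the odd ones, in whatever order the producers happened to interleave.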

Generalization: Object-oriented Rivers

  • Rivers transport sub-class of record-set (= stream of objects)

    • record type and partitioning are part of subclass

  • Node transformers are data pumps

    • an object with river inputs and outputs

    • do late-binding to record-type

  • Programming becomes data flow programming

    • specify the pipelines

  • Compiler/Scheduler does data partitioning and “transformer” placement
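
A single-node flavor of "specify the pipelines" can be sketched with Python generators. This is only an analogy under stated assumptions: the helper names (`pipeline`, `keep_even`, `square`) are hypothetical, there is no partitioning or placement, and duck typing stands in for late binding to the record type.

```python
def pipeline(stream, *transformers):
    """Chain transformers: each takes an input stream and returns an output stream."""
    for transformer in transformers:
        stream = transformer(stream)
    return stream


def keep_even(stream):
    """A data pump that passes through only even records."""
    for record in stream:
        if record % 2 == 0:
            yield record


def square(stream):
    """A data pump that maps each record to its square."""
    for record in stream:
        yield record * record


if __name__ == "__main__":
    # The program only names the stages; records are pulled through lazily.
    print(list(pipeline(range(6), keep_even, square)))
```

The caller states what flows where, and the runtime decides when each stage runs. On a cluster, the slide assigns the same role to the compiler/scheduler.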

NT Cluster Sort as a Prototype

  • Using data generation and sort as a prototypical app

  • “Hello world” of distributed processing

  • Goal: easy install & execute

Remote Install

  • Add Registry entry to each remote node.

    • RegConnectRegistry()

    • RegCreateKeyEx()
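
The two calls above have counterparts in Python's standard `winreg` module (`ConnectRegistry` and `CreateKeyEx`), which allows a short sketch of the remote-install step. The slide does not say which key or values are written, so the subkey, value names, and node names below are hypothetical, and the `reg` parameter exists only so the logic can be exercised off Windows with a stand-in object.

```python
def install_on_node(node, subkey, values, reg=None):
    """Create `subkey` under HKEY_LOCAL_MACHINE on a remote node and set string values.

    `node` is a UNC-style computer name such as r"\\node01" (hypothetical).
    `reg` defaults to the standard winreg module, which exists only on Windows.
    """
    if reg is None:
        import winreg as reg  # Windows only

    root = reg.ConnectRegistry(node, reg.HKEY_LOCAL_MACHINE)
    try:
        key = reg.CreateKeyEx(root, subkey, 0, reg.KEY_WRITE)
        try:
            for name, data in values.items():
                reg.SetValueEx(key, name, 0, reg.REG_SZ, data)
        finally:
            reg.CloseKey(key)
    finally:
        reg.CloseKey(root)


def install_on_cluster(nodes, subkey, values, reg=None):
    """Repeat the same registry entry on every node in the list."""
    for node in nodes:
        install_on_node(node, subkey, values, reg=reg)
```

A call might look like `install_on_cluster([r"\\node01", r"\\node02"], r"SOFTWARE\ClusterSortDemo", {"Path": r"C:\sort\sort.exe"})`, where every name is illustrative. Writing to a remote registry also requires suitable permissions and the remote registry service on each node.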

Cluster Startup / Execution

  • Setup:

    • MULTI_QI struct

    • COSERVERINFO struct

  • CoCreateInstanceEx()

  • Retrieve remote object handle from MULTI_QI struct

  • Invoke methods as usual

[Diagram labels: MULTI_QI, COSERVERINFO; three HANDLEs, each with a Sort() call.]

Cluster Sort Conceptual Model

  • Multiple Data Sources

  • Multiple Data Destinations

  • Multiple nodes

  • Disks -> Sockets -> Disk -> Disk

[Diagram: each source holds a mix of AAA, BBB, and CCC records; after the sort, destination A holds the AAA records, B the BBB records, and C the CCC records.]
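
The conceptual model can be sketched as a range-partitioned sort in a single process. This is a minimal illustration, assuming the splitter keys are chosen in advance and using in-memory lists in place of disks and sockets; `partition_sort` and its parameters are hypothetical names.

```python
import bisect


def partition_sort(sources, splitters):
    """Route each record to a destination by key range, then sort each destination.

    `splitters` is a sorted list of boundary keys; a record goes to destination i,
    where i is the number of splitters less than or equal to the record.
    """
    destinations = [[] for _ in range(len(splitters) + 1)]
    for source in sources:                      # multiple data sources
        for record in source:
            index = bisect.bisect_right(splitters, record)
            destinations[index].append(record)  # stands in for the socket transfer
    return [sorted(d) for d in destinations]    # local sort at each destination


if __name__ == "__main__":
    sources = [
        ["CCC", "AAA", "BBB"],
        ["BBB", "CCC", "AAA"],
        ["AAA", "BBB", "CCC"],
    ]
    for name, records in zip("ABC", partition_sort(sources, ["B", "C"])):
        print(name, records)
```

Because the key ranges do not overlap, concatenating the destinations in order gives the fully sorted output without a final merge across nodes.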
