
Microsoft Large Databases and Grid Computing

Jim Gray

Microsoft Research

Gray@Microsoft.com

http://research.Microsoft.com/~gray

21 May 2003

About me
  • In Microsoft Research (located in San Francisco)
  • A database researcher
    • IBM, Tandem, DEC, Microsoft
  • Work on Scalable Systems
    • Building supercomputers from commodity components.
  • Do academic/government things too
    • PITAC, GriPhyN TAB, NSF/CISE, Library of Congress, …
  • For the last 4 years, been working with the astronomy community to build the World Wide Telescope.
Agenda
  • TerraServer
    • What it is
    • What we learned
    • What we are doing now.
  • SkyServer / WWT
    • What it is
    • What we learned
    • What we are doing now
  • Grid Computing
    • General comments
    • Build a web service
TerraServer
TerraService.net
  • A photo of the United States
    • 1 meter resolution (photographic/topographic)
    • USGS data
    • Some demographic data (BestPlaces.net)
    • Home sales data
    • Linked to Encarta Encyclopedia
  • 15 TB raw, 6 TB cooked (grows 10 GB/week)
  • Point, pan, zoom interface
  • Among top 1,000 websites
    • 40k visitors/day
    • 4M queries/day
    • 3 B page views (in 5 years)
  • All in an SQL database
TerraServer Statistics (June 1998 - Oct, 2002; peak day in Sept ’01)
  • Unique Users: 40,011 daily average; 277,292 peak day; 63,656,904 total
  • Page Views: 1,266,838 daily average; 2,401,209 peak day; 2,015,539,605 total
  • Image Tiles: 3,735,789 daily average; 10,475,674 peak day; 5,943,641,024 total
  • Db Queries: 4,484,089 daily average; 12,388,104 peak day; 7,134,186,170 total
  • Bytes Xfered: 70 GB daily average; 163 GB peak day; 108 TB total

[Figure: database growth timeline with marks at June ’98, Jan ’99, Jan ’00, May ’00 and Dec ’02. Labels: 173 M, 217 M, 231 M, 298 M, 755 M and 900 M rows; SQL 7.0 databases of .75, 1.0 and 1.5 TB; SQL 2000 databases of .8, 1.0, 1.2, 1.4 and 2.0 TB; 1 Server / Win NT 4.0 EE; 2nd Server / Win 2k DataCenter; 4 Node / Win2k Datacenter Failover Cluster]

TerraServer Cluster
  • 8 Compaq DL360 “Photon” Web Servers
  • 4 Compaq ProLiant 8500 Db Servers
  • Fiber SAN Switches
  • One SQL database per rack
    • Each rack contains 4.5 TB
    • 1 rack not in picture
    • 18.0 TB total
  • Meta Data: stored on 101 GB “Fast, Small Disks” (18 x 18.2 GB)
  • Imagery Data: stored on 4 339 GB “Slow, Big Disks” (15 x 73.8 GB)
  • SQL instances: SQL\Inst1, SQL\Inst2, SQL\Inst3, Spare
  • Added 90 72.8 GB disks in Feb 2001 to create 18 TB SAN

[Figure: picture of the racks, with units labeled E through S]

Cluster Configuration

[Diagram: network path from the Internet to the TerraServer SAN. Labeled components:]
  • Internet, Cisco 12000 Internet Router
  • Extreme Networks Summit 48 Switch; Summit 7i Switch (2)
  • Gigabit Ethernet and 100-Mbps Ethernet links
  • Compaq DL360 (10); Compaq DL360 (6) (Windows 2000 Web Servers), TerraServer.microsoft.com
  • Database Cluster: Compaq ProLiant 8500 (4)
  • TerraServer SAN: Compaq SANswitch by Brocade Communications; Compaq StorageWorks MA8000/HSG80 Controllers (3)
  • ADIC LTO Tape Library
  • Microsoft Corporate LAN

TerraServer Becomes a Web Service
TerraServer.net -> TerraService.Net
  • Web server is for people.
  • Web Service is for programs
    • The end of screen scraping
    • No faking a URL: pass real parameters.
    • No parsing the answer: data formatted into your address space.
  • Hundreds of users but a specific example:
    • US Department of Agriculture
Data Gateway Functional Overview

[Diagram: data gateway spanning ITC - Fort Collins, Colorado and NCGC - Fort Worth, Texas. Labeled components:]
  • Services: Catalog Service, Billing Services, Navigation Service, Ship Service, Package Service, Rimage CD Service, FTP Services, Soil Data Viewer, TerraService
  • Stores and feeds: Order Database, Geospatial Data, Product Catalog Updates
  • Links: XML, ASP; Customer Orders Data; <<Requests Products>>; Send order info; Selects from
  • Order Placer
    • validate (dtd)
    • Insert into SQL
    • @@Identity / GUID to client
    • return est time
    • raise OrderMgr.event
  • Logger
    • Called by anyone
    • raises to stats svc
  • Data Services
    • XML Request for data
  • Item Broker
    • Listen for OrderPlacer Raised Event
    • Select sequenced Item
    • Acknowledges item ready for delivery
    • Output XML
    • raise event: stats.delivery start
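The Order Placer steps named in the diagram (validate, insert into SQL, identity / GUID to client, return est time, raise event) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the USDA gateway code: it uses Python with an in-memory SQLite table instead of SQL Server, a well-formedness and required-field check instead of real DTD validation, `lastrowid` as a stand-in for `@@Identity`, and a hard-coded placeholder estimate. The names `place_order`, `orders`, and the `<order>`/`<product>` elements are hypothetical.

```python
import sqlite3
import uuid
import xml.etree.ElementTree as ET

def place_order(conn, order_xml, listeners):
    """Validate an XML order, store it, and notify listeners."""
    # Stand-in for DTD validation: parse, then check required fields.
    root = ET.fromstring(order_xml)  # raises ParseError if malformed
    product = root.findtext("product")
    if root.tag != "order" or not product:
        raise ValueError("order must contain a <product> element")

    # Insert into SQL and pick up the generated row identity.
    order_guid = str(uuid.uuid4())
    cur = conn.execute(
        "INSERT INTO orders (guid, product) VALUES (?, ?)",
        (order_guid, product),
    )
    conn.commit()
    order_id = cur.lastrowid  # SQLite analogue of @@Identity

    # Identity / GUID and an estimated time go back to the client.
    # The 30-minute estimate is a placeholder, not a real figure.
    receipt = {"id": order_id, "guid": order_guid, "est_minutes": 30}

    # "raise OrderMgr.event": tell every registered listener.
    for listener in listeners:
        listener(receipt)
    return receipt

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, guid TEXT, product TEXT)"
)
events = []
receipt = place_order(
    conn, "<order><product>soil-map</product></order>", [events.append]
)
print(receipt["id"], len(events))
```

An Item Broker in this sketch would simply be another callable in `listeners` that reacts to the raised event.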

Custom End Product

Web Soil Data Viewer

XML Soil Report

Soil Interpretation Map

Web Server - COM+ Applications
  • ArcIMS Connector: connects to ArcIMS; communication is done through ArcIMS XML (AXL)
  • WebSDV: retrieves and processes soils data from the NASIS relational database
  • IMSNavigator: generates maps (JPGs) using ArcIMS
  • Image Retriever: retrieves imagery from the Microsoft TerraServer

Database Server - ESRI Spatial Data Server
  • ESRI Spatial Data Engine

Database Server - Microsoft SQL Server

Data: Business Rules, National Soils Data, Geospatial Data, Microsoft TerraServer

Brief tour of TerraService
  • Show map service
  • Show some methods
  • See: “TerraService.NET: An Introduction to Web Services,” Tom Barclay; Jim Gray; Eric Strand; Steve Ekblad; Jeffrey Richter, MSR TR 2002-53, pp 13, June 2002

What We Learned
  • You can build and manage a very popular website with relatively little effort (if you do it right and have Tom Barclay)
  • Loading 20 TB takes a lot of energy
  • And you get to do it many times -- automate
  • Tape and tape software are problematic
  • Triplex and snap-shot disks work (we have never had to use them, but..)
  • The Internet gives you 2 9’s. Servers can run at 4 9’s easily, 5 9’s with effort.
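To make the "9’s" concrete, here is a small Python sketch that converts a number of nines into expected downtime per year. It assumes the usual reading (n nines means availability of 1 - 10^-n) and a 365-day year; the function name is just illustrative.

```python
HOURS_PER_YEAR = 365 * 24

def downtime_hours_per_year(nines):
    """Expected downtime per year for an availability of `nines` nines."""
    unavailability = 10 ** (-nines)  # e.g. 2 nines -> 1% unavailable
    return HOURS_PER_YEAR * unavailability

for n in (2, 4, 5):
    hours = downtime_hours_per_year(n)
    print(f"{n} nines: {hours:.2f} hours/year ({hours * 60:.0f} minutes)")
```

On these assumptions, 2 nines is roughly three and a half days of outage a year, while 5 nines is about five minutes.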
What we are doing now.
  • Building with 3K$ 2TB bricks
  • 4 bricks = 1 backend
  • Triplexing systems
  • Duplexing sites.
  • 4*3*2 = 24k$ for Geoplex
  • Very simple operations model
  • See: “TeraScale SneakerNet: Using Inexpensive Disks for Backup, Archiving, and Data Exchange,” Jim Gray; Wyman Chong; Tom Barclay; Alex Szalay; Jan vandenBerg, pp. 1-8, May 2002
Agenda
  • TerraServer
    • What it is
    • What we learned
    • What we are doing now.
  • SkyServer / WWT
    • What it is
    • What we learned
    • What we are doing now
  • Grid Computing
    • General comments
    • Build a web service
SkyServer
SkyServer.SDSS.org
  • Like the TerraServer, but looking the other way: a picture of ¼ of the universe
  • Pixels + Data Mining
  • Astronomers get about 400 attributes for each “object”
  • Get Spectrograms for 1% of the objects
Why Astronomy Data?

[Figure: sky images by waveband: ROSAT ~keV, DSS Optical, 2MASS 2 µm, IRAS 25 µm, IRAS 100 µm, GB 6 cm, NVSS 20 cm, WENSS 92 cm]
  • It has no commercial value
    • No privacy concerns
    • Can freely share results with others
    • Great for experimenting with algorithms
  • It is real and well documented
    • High-dimensional data (with confidence intervals)
    • Spatial data
    • Temporal data
  • Many different instruments from many different places and many different times
  • Federation is a goal
  • The questions are interesting
    • How did the universe form?
  • There is a lot of it (petabytes)
Demo of SkyServer
  • Shows standard web server
  • Pixel/image data
  • Point and click
  • Explore one object
  • Explore sets of objects (data mining)
Virtual Observatory
http://www.astro.caltech.edu/nvoconf/
http://www.voforum.org/
  • Premise: Most data is (or could be) online
  • So, the Internet is the world’s best telescope:
    • It has data on every part of the sky
    • In every measured spectral band: optical, x-ray, radio..
    • As deep as the best instruments (2 years ago).
    • It is up when you are up.
    • The “seeing” is always great (no working at night, no clouds, no moons, no..).
    • It’s a smart telescope: links objects and data to literature on them.
Time and Spectral Dimensions
The Multiwavelength Crab Nebula

Crab star, 1054 AD

X-ray, optical, infrared, and radio views of the nearby Crab Nebula, which is now in a state of chaotic expansion after a supernova explosion first sighted in 1054 A.D. by Chinese astronomers.

Slide courtesy of Robert Brunner @ CalTech.

Data Federations of Web Services
  • Massive datasets live near their owners:
    • Near the instrument’s software pipeline
    • Near the applications
    • Near data knowledge and curation
    • Super Computer centers become Super Data Centers
  • Each Archive publishes a web service
    • Schema: documents the data
    • Methods on objects (queries)
  • Scientists get “personalized” extracts
  • Uniform access to multiple Archives
    • A common global schema

Federation

Grid and Web Services Synergy
  • I believe the Grid will be many web services that share data (computrons are free)
  • IETF standards provide
    • Naming
    • Authorization / Security / Privacy
    • Distributed Objects
      • Discovery, Definition, Invocation, Object Model
    • Higher level services: workflow, transactions, DB,..
  • Synergy: commercial Internet & Grid tools
Web Services: The Key?

[Diagram: Your program <-> http <-> Web Server]

  • Web SERVER:
    • Given a url + parameters
    • Returns a web page (often dynamic)
  • Web SERVICE:
    • Given a XML document (soap msg)
    • Returns an XML document
    • Tools make this look like an RPC.
      • F(x,y,z) returns (u, v, w)
    • Distributed objects for the web.
    • + naming, discovery, security,..
  • Internet-scale distributed computing

[Diagram: the Web Server returns a Web page to your program. Your program <-> soap <-> Web Service, which returns Data in your address space (object in XML)]
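A rough Python sketch of what the tools do to make a web service "look like an RPC": wrap F(x, y, z) in a SOAP envelope, and turn the XML answer back into data in your address space. Assumptions are loud here: the application namespace `http://example.org/demo`, the method name `F`, and the `FResponse` element are hypothetical, the response is canned rather than fetched over http, and real toolkits generate this plumbing from a service description instead of hand-written helpers.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
APP_NS = "http://example.org/demo"  # hypothetical service namespace

def build_request(method, **params):
    """Wrap a method call and its named parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")

def parse_response(xml_text):
    """Turn the first element in the SOAP body into a plain dict."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    result = body[0]
    # Strip the "{namespace}" prefix from each child tag.
    return {child.tag.split("}")[1]: child.text for child in result}

request = build_request("F", x=1, y=2, z=3)

# Canned reply standing in for what a service might send back.
canned = (
    f'<s:Envelope xmlns:s="{SOAP_NS}" xmlns:a="{APP_NS}">'
    "<s:Body><a:FResponse><a:u>10</a:u><a:v>20</a:v><a:w>30</a:w>"
    "</a:FResponse></s:Body></s:Envelope>"
)
print(parse_response(canned))
```

No URL faking and no page parsing: the parameters go in as named elements and the result comes back as named fields.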

SkyQuery: a prototype
  • Defining Astronomy Objects and Methods.
  • Federated 3 Web Services (fermilab/sdss, jhu/first, Cal Tech/dposs)
    • Multi-survey cross-match
    • Distributed query optimization (T. Malik, T. Budavari, Alex Szalay @ JHU)
    • http://SkyQuery.net/

  • My first web service (cutout + annotated SDSS images) online
    • http://skyservice.pha.jhu.edu/devel/ImgCutout/chart.asp
  • WWT is a great Web Services (.Net) application
    • Federating heterogeneous data sources.
    • Cooperating organizations
    • An Information At Your Fingertips challenge.
Demo of Image Cutout Service
  • Shows image cutout
  • Show project and debugging project
  • Show hello World
  • Show “theAnswer” method
SkyQuery (http://skyquery.net/)
  • Distributed Query tool using a set of services
  • Feasibility study, built in 6 weeks from scratch
    • Tanu Malik (JHU CS grad student)
    • Tamas Budavari (JHU astro postdoc)
  • Implemented in C# and .NET
  • Allows queries like:

    SELECT o.objId, o.r, o.type, t.objId
    FROM SDSS:PhotoPrimary o,
         TWOMASS:PhotoPrimary t
    WHERE XMATCH(o,t)<3.5
      AND AREA(181.3,-0.76,6.5)
      AND o.type=3 AND (o.I - t.m_j)>2

SkyNode Basic Web Services
  • Metadata information about resources
    • Waveband
    • Sky coverage
    • Translation of names to universal dictionary (UCD)
  • Simple search patterns on the resources
    • Cone Search
    • Image mosaic
    • Unit conversions
  • Simple filtering, counting, histogramming
  • On-the-fly recalibrations
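As an illustration of the simplest search pattern in the list, a cone search returns the objects within some angular radius of a sky position. The Python sketch below is a naive in-memory version under stated assumptions: positions are (ra, dec) in degrees, the catalog is a short made-up list, and the function names are hypothetical. It is not the SkyNode implementation, which would run against a database.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle angle in degrees between two sky positions (degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine form: numerically stable for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h)))

def cone_search(catalog, ra, dec, radius_deg):
    """Return catalog rows whose position lies within radius_deg of (ra, dec)."""
    return [row for row in catalog
            if angular_separation_deg(ra, dec, row["ra"], row["dec"]) <= radius_deg]

# Made-up objects, for illustration only.
catalog = [
    {"objId": 1, "ra": 181.30, "dec": -0.76},
    {"objId": 2, "ra": 181.35, "dec": -0.70},
    {"objId": 3, "ra": 190.00, "dec": 5.00},
]
hits = cone_search(catalog, ra=181.3, dec=-0.76, radius_deg=0.1)
print([row["objId"] for row in hits])  # [1, 2]
```

A linear scan like this is fine for a sketch; a real archive would use a spatial index so the search does not touch every row.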
Portals: Higher Level Services
  • Built on Atomic Services
  • Perform more complex tasks
  • Examples
    • Automated resource discovery
    • Cross-identifications
    • Photometric redshifts
    • Outlier detections
    • Visualization facilities
  • Goal:
    • Build custom portals in days from existing building blocks (like today in IRAF or IDL)
Architecture

[Diagram components: Web Page, SkyQuery, Image cutout, SkyNode SDSS, SkyNode 2Mass, SkyNode First]

Summary So Far
  • Some real web services deployed today
  • Easy to build & deploy
  • Services publish data, Portals unify it
  • Tools really work!
  • I’m using C# and the foundation classes of Visual Studio, a great tool!
  • A nice book explaining the ideas: (.Net Framework Essentials, Thai, Lam, ISBN 0-596-00302-1)
Possible Relevance to You

[Diagram: Your program <-> soap <-> Web Service; Data in your address space, object in XML]
  • This web service stuff is REAL
  • If you have a class, it is a way to publish data: Internet, Intranet
  • It is a way to find data
    • Data comes with schema
    • No more screen scraping/parsing
  • Business model unclear
    • Your ideas go here.
What We Learned
  • Web services really are a breakthrough.
  • Data mining worked beautifully. See “Data Mining the SDSS SkyServer Database,” J. Gray, D. Slutz, A. Szalay, A. Thakar, P. Kunszt, C. Stoughton, MSR TR 2002-1, pp 1-40, 2002.
  • You can operate a system in Chicago from San Francisco – Terminal Server is wonderful.
  • The Internet gives you 2 9’s of availability
  • TeraScale SneakerNet works well
What we are doing now.
  • Loading more data (next data release)
  • Preparing for the next generation
  • Building the WWT
  • “Web Services for the Virtual Observatory,” Alexander S. Szalay, Tamás Budavári, Tanu Malik, Jim Gray, and Ani Thakar, SPIE Astronomy Telescopes and Instruments, 22-28 August 2002, Waikoloa, Hawaii.
  • “Petabyte Scale Data Mining: Dream or Reality?,” Alexander S. Szalay; Jim Gray; Jan vandenBerg, SPIE Astronomy Telescopes and Instruments, 22-28 August 2002, Waikoloa, Hawaii.
  • “Online Scientific Data Curation, Publication, and Archiving,” Jim Gray; Alexander S. Szalay; Ani R. Thakar; Christopher Stoughton; Jan vandenBerg, SPIE Astronomy Telescopes and Instruments, 22-28 August 2002, Waikoloa, Hawaii.
Agenda
  • TerraServer
    • What it is
    • What we learned
    • What we are doing now.
  • SkyServer / WWT
    • What it is
    • What we learned
    • What we are doing now
  • Grid Computing
    • General comments
    • Build a web service
The Grid
  • Computation Grid: harvest Internet cpus.
  • Data Grid: Share files
  • Application Grid: Web services
  • Access Grid: teleconferencing
The Microsoft View
  • Web Services will subsume the Grid
    • The Grid will be data and services, not renting cycles
  • OGSA: evolution of Globus Toolkit to Web services concepts and technologies…
  • Lots of encouragement from Microsoft, IBM, Oracle, Sun
  • GGF as forum for discussion
Engagement with Grid Community
  • Goal: GXA as infrastructure for Grids
  • Working with Globus & GGF
    • Funding work at Argonne National Lab (Globus)
    • Globus Toolkit 3, and CondorG on Windows
      • http://www.globus.org/win-alpha/ (we sponsored this)
    • OGSA for .NET (prototyping)
      • http://www.globus.org/ogsa/
    • Also OGSI.NET at U. VA is very interesting
      • http://www.cs.virginia.edu/~gsw2c/ogsi.net.html
    • GGF
      • Active membership
  • HPC .net kit – see http://www.microsoft.com/HPC
    • Part of .net server scale out development
    • Includes MPI-CH 1.2.4, distributed job scheduler,…
    • Thomas Sterling, Beowulf on Windows, MIT Press 2001
What’s Microsoft Doing
  • Mostly .NET, W3C standards, web services, …
  • I think SkyQuery is the best web service (grid app) in GriPhyN today.
  • My stuff is grid computing
  • But…
  • Globus (GT3), OGSA, and CondorG ported to Windows (we sponsored it)
  • We have a HPC toolkit: MPI-CH 1.2.4
  • See http://www.microsoft.com/windows2000/hpc/ for many useful links
I Can Talk About Computing on Demand But…
Best to read
  • Distributed Computing Economics, Jim Gray, MSR-TR-2003-24, March 2003
  • The slides that follow are based on that paper.
Distributed Computing Economics
  • Why is Seti@Home a great idea?
  • Why is Napster a great deal?
  • Why is the Computational Grid uneconomic?
  • When does computing on demand work?
  • What is the “right” level of abstraction
  • Is the Access Grid the real killer app?
Computing is Free
  • Computers cost 1k$ (if you shop right)
  • So 1 cpu day == 1$
  • If you pay the phone bill (and I do), Internet bandwidth costs 50 … 500$/Mbps/month (not including routers and management).
  • So 1 GB costs 1$ to send and 1$ to receive
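The two rules of thumb on this slide can be checked with back-of-envelope arithmetic. The Python sketch below assumes a 3-year service life for the computer, a 30-day month, decimal gigabytes, and a link that is kept completely full; none of these assumptions are stated on the slide, and a less busy link would cost more per GB.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600

def cpu_dollars_per_day(computer_price=1000.0, lifetime_years=3.0):
    """Purchase price spread evenly over an assumed service life."""
    return computer_price / (lifetime_years * 365)

def bandwidth_dollars_per_gb(dollars_per_mbps_month):
    """Cost per GB if a 1 Mbps link is kept full for a whole month."""
    gb_per_month = 1e6 / 8 * SECONDS_PER_MONTH / 1e9  # 324 GB
    return dollars_per_mbps_month / gb_per_month

print(f"cpu: {cpu_dollars_per_day():.2f} $/day")
for price in (50, 500):
    print(f"{price} $/Mbps/month -> {bandwidth_dollars_per_gb(price):.2f} $/GB")
```

Under these assumptions a cpu day comes out a little under 1$, and the 50 … 500$/Mbps/month range brackets roughly 0.15$ to 1.5$ per GB, which is consistent with the slide's round figure of 1$.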
Why is Seti@Home a Good Deal?
  • Sending 300 KB costs 3e-4$
  • User computes for ½ day: benefit 5e-1$
  • ROI: 1500:1
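The ROI can be reproduced from the unit prices on the previous slide (1$ per GB sent, 1$ per cpu day). A minimal Python sketch, assuming decimal units and the slide's round prices:

```python
def roi(benefit_dollars, cost_dollars):
    """Return on investment as a simple benefit-to-cost ratio."""
    return benefit_dollars / cost_dollars

send_cost = 300e3 / 1e9 * 1.0  # 300 KB at 1$/GB
cpu_benefit = 0.5 * 1.0        # half a cpu day at 1$/day

print(f"cost {send_cost:.0e}$, benefit {cpu_benefit}$, "
      f"ROI about {roi(cpu_benefit, send_cost):.0f}:1")
```

The exact ratio comes out somewhat above the slide's 1500:1, which reads as a rounded-down figure of the same order.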
Why is Napster a Good Deal?
  • Sending 5 MB costs 5e-3$
  • ½ a penny per song
  • Both sender and receiver can afford it.
  • Same logic powers web sites (Yahoo!...):
    • 1e-3$/page view advertising revenue
    • 1e-5$/page view cost of serving web page
    • 100:1 ROI
The Cost of Computing: Computers are NOT free!
  • Capital Cost of a TpcC system is mostly storage and storage software (database)
  • IBM 32 cpu, 512 GB ram, 2,500 disks, 43 TB (680,613 tpmC @ 11.13 $/tpmC, available 11/08/03)
    • http://www.tpc.org/results/individual_results/IBM/IBMp690es_05092003.pdf
  • A 7.5M$ super-computer
  • Total Data Center Cost: 40% capital & facilities, 60% staff (includes app development)
Computing Equivalents
1 $ buys
  • 1 day of cpu time
  • 4 GB ram for a day
  • 1 GB of network bandwidth
  • 1 GB of disk storage
  • 10 M database accesses
  • 10 TB of disk access (sequential)
  • 10 TB of LAN bandwidth (bulk)
Some consequences
  • Beowulf networking is 10,000x cheaper than WAN networking: factors of 10^5 matter.
  • The cheapest and fastest way to move a Terabyte cross country is sneakernet: 24 hours = 4 MB/s, 50$ shipping vs 1,000$ WAN cost.
  • Sending 10 PB of CERN data via network is silly: buy disk bricks in Geneva, fill them, ship them – one way.

TeraScale SneakerNet: Using Inexpensive Disks for Backup, Archiving, and Data Exchange

Jim Gray; Wyman Chong; Tom Barclay; Alex Szalay; Jan vandenBerg

Microsoft Technical Report, May 2002, MSR-TR-2002-54

http://research.microsoft.com/research/pubs/view.aspx?tr_id=569

How Do You Move A Terabyte?

Context      Speed (Mbps)  Rent ($/month)  $/Mbps  $/TB sent  Time/TB
Home phone   0.04          40              1,000   3,086      6 years
Home DSL     0.6           70              117     360        5 months
T1           1.5           1,200           800     2,469      2 months
T3           43            28,000          651     2,010      2 days
OC3          155           49,000          316     976        14 hours
OC 192       9600          1,920,000       200     617        14 minutes
100 Mbps     100                                              1 day
Gbps         1000                                             2.2 hours
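The $/TB and Time/TB figures follow from speed and rent alone. The Python sketch below shows one way to derive them; it assumes a decimal terabyte (8e12 bits), a 30-day month, and a link used at full speed. Those assumptions are mine, chosen because they reproduce the slide's numbers.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600
TERABYTE_BITS = 8e12

def seconds_per_tb(speed_mbps):
    """Time to push one terabyte through a link of the given speed."""
    return TERABYTE_BITS / (speed_mbps * 1e6)

def dollars_per_tb(speed_mbps, rent_per_month):
    """Rent paid while the link is busy sending one terabyte."""
    return rent_per_month * seconds_per_tb(speed_mbps) / SECONDS_PER_MONTH

# (speed in Mbps, rent in $/month), as listed on the slide.
links = {
    "Home phone": (0.04, 40),
    "Home DSL": (0.6, 70),
    "T1": (1.5, 1200),
    "T3": (43, 28000),
    "OC3": (155, 49000),
    "OC 192": (9600, 1920000),
}
for name, (mbps, rent) in links.items():
    days = seconds_per_tb(mbps) / 86400
    print(f"{name:10s} {days:10.2f} days/TB {dollars_per_tb(mbps, rent):8.0f} $/TB")
```

The point survives the arithmetic: even on the fastest link, a terabyte costs hundreds of dollars to send.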
Computational Grid Economics
  • To the extent that the computational grid is like Seti@Home or ZetaNet or Folding@home or… it is a great thing.
  • To the extent that the computational grid is MPI or data analysis, it fails on economic grounds: move the programs to the data, not the data to the programs.
  • The Internet is NOT the cpu backplane.
  • The USG should not hide this economic fact from the academic/scientific research community.
Computing on Demand
  • Was called outsourcing / service bureaus in my youth. CSC and IBM did it.
  • Payroll is standard outsource.
  • Now we have Hotmail, Salesforce.com, Oracle.com,….
  • Works for standard apps.
  • Airlines outsource reservations. Banks outsource ATMs.
  • But Amazon, Amex, Wal-Mart, ... can’t outsource their core competence.
  • So, COD works for commoditized services.
  • It is not a new way of doing things: think payroll.
What’s the right abstraction level for Internet Scale Distributed Computing?
  • Disk block? No, too low.
  • File? No, too low.
  • Database? No, too low.
  • Application? Yes, of course.
    • Blast search
    • Google search
    • Send/Get eMail
    • Portals that federate astronomy archives (http://skyQuery.Net/)
  • Web Services (.NET, EJB, OGSA) give this abstraction level.
Access Grid
  • Q: What comes after the telephone?
  • A: eMail?
  • A: Instant messaging?
  • Both seem retro technology: text & emoticons.
  • Access Grid could revolutionize human communication.
  • But, it needs a new idea.
  • Q: What comes after the telephone?
Distributed Computing Economics
  • Why is Seti@Home a great idea?
  • Why is Napster a great deal?
  • Why is the Computational Grid uneconomic?
  • When does computing on demand work?
  • What is the “right” level of abstraction?
  • Is the Access Grid the real killer app?

Based on: Distributed Computing Economics, Jim Gray, Microsoft Tech report, March 2003, MSR-TR-2003-24

http://research.microsoft.com/research/pubs/view.aspx?tr_id=655

Agenda
  • TerraServer
    • What it is
    • What we learned
    • What we are doing now.
  • SkyServer / WWT
    • What it is
    • What we learned
    • What we are doing now
  • Grid Computing
    • General comments
    • Build a web service