1 / 43

Operational Standards for Real Applications in Grid Challenges

This article discusses the importance of following operational standards based on real applications in order to achieve the vision of grid computing. It looks at past visions and challenges, the growth of the internet, and the need for increased capacity and new services. The article also explores the use of smart clients and the National Virtual Observatory in scientific data analysis.

dheller
Download Presentation

Operational Standards for Real Applications in Grid Challenges

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Grid ChallengesIt’s the vision, stupid…but it NEEDS TO be followed by operational standardsbased on real applications… The Global Grid Forum 25 June 2003 Gordon Bell Microsoft Corporation

  2. A quick look at some past visionsand a challenge • NREN >> Internet • WWW • Challenge: Will match any Grid enabled application that wins a Gordon Bell Prize for parallelism

  3. FCCSET NREN Plan 11/1987 10G- 1G- 100M- 10M- 1M- 100K- 10K- 3 G Optical a factor of 1000 makes a difference 45 M Phase 2 1.5 M Phase 1 56K 1988 1990 1992 1994 1996 1998 2000

  4. Originating Bandwidth (Gb/s)U.S. Interstate Comm. traffic L Roberts ’92ARPAnet Goals c1972 = Grid Goals 10,000- 1,000- 100- 10- 1- Video Conf. Voice Video on Demand Email NSF bb• FAX Broadcast TV |1990 | |2000 | |2010 | |2020

  5. Growth in hype vs reality WWW books, newspapers Infoway regulation Infoway speculation “how great it’ll be” (politicians , telecoms & futurists) Infoway addiction conferences lawsuits c 1995 Data from Gordon’s WAG

  6. Articles per newspaper versusorders per second sent via Internet orders per second articles per newspaper c 1995 Data from Gordon’s WAG

  7. Articles about security, privacy, & fraud versus commerce ($M) actual commerce articles about risk and NOT doing commerce organized crime on Internet c 1995 Data from Gordon’s WAG

  8. Increased Demand Increase Capacity(circuits & bw) Create new service Lower response time WWW Audio Video Grids Voice! The virtuous cycle of bandwidth supply and demand Standards IP Telnet & FTP EMAIL Video Conf. FTP Web Svcs

  9. Grid Book c1998 from 1996 www.mkp.com/grids The Globus Project™ www.globus.org OGSA www.globus.org/ogsa Global Grid Forum www.gridforum.org Grid Computing 2003 For More Information 651 pp. 22 chapters, 41 authors 1080 pages 43 chapters, O(100) authors

  10. Progress...a review • Grid started out with great promise…c1998Interesting use at NASA for coupled programs • NMI (National Middleware Infrastructure)…State_Tools.gov, funded by NSF.govclearly open, clearly not “free” not IETF model • Tools vs. standards & evolving working code • Some examples: • C1980: Seti@home, folding@home, >> Napster p2p • 2001 15 TB Terraserver > Terraservice w/Web Services • 2003 Alex Szelay & Jim Gray: Skyserver/skyservice • Cornell Theory Center Web Services based apps • NEES—good poster child. An XML task • GRADs and Teragrid… dream or research or just $$s?

  11. TerraServer c1998:The “Whole Earth” Database

  12. To the rescue! TerraServer Experiencec2001 • Successful Web Site • 50,000 daily users satisfied with “human-accessible” data • 59 GB imagery transmitted daily • New Feature Requests • Programmable access to meta-data • User selectable image sizes, i.e. “a map server” • Permission to use TerraServer data within server applications

  13. Smart Clients WindowsForms .NET Framework ADO.NET .NET TerraService Architecture HTML Map UI Web Forms Standard Browsers Image/jpeg Existing DB Server Map Server Http Handler 668 m Rows SQL 20001.0 TB Db Image/jpeg TerraServer Web Service SQL 20001.0 TB Db XML SQL 20001.0 TB Db OLEDB

  14. Data Intensive Science: the Next Frontier The W.M. Keck Fellowsin Advanced Scientific Data Analysis Alex SzalayThe Johns Hopkins UniversityDepartment of Physics and Astronomy

  15. National Virtual Observatory • NSF ITR project, “Building the Framework for the National Virtual Observatory” is a collaboration of 17 funded and 3 unfunded organizations • Astronomy data centers • National observatories • Supercomputer centers • University departments • Computer science/information technology specialists • PI and project director: Alex Szalay (JHU) • CoPI: Roy Williams (Caltech/CACR)

  16. Scientific Data Exploration • Thousand years ago: science was empirical • describing natural phenomena • Last few hundred years: theoretical branch • using models, generalizations • Last few decades: a computational branch • simulating complex phenomena • Today: data exploration is emerging • synthesizing theory, experiment and computation with advanced data management and statistics

  17. Living in an Exponential World • Astronomers have a few hundred TB now • 1 pixel (byte) / sq arc second ~ 4TB • Multi-spectral, temporal, … → 1PB • They mine it looking fornew (kinds of) objects, more of interesting ones (quasars), density variations in 400-D space, correlations in 400-D space • Data doubles every year • Data is public after 1 year • So, 50% of the data is public • Same trend appears in all sciences

  18. ROSAT ~keV DSS Optical IRAS 25m 2MASS 2m GB 6cm WENSS 92cm NVSS 20cm IRAS 100m Why Is Astronomy Special? • It has no commercial value • No privacy concerns, freely share results with others • Great for experimenting with algorithms • It is real and well documented • High-dimensional (with confidence intervals) • Spatial, temporal • Diverse and distributed • Many different instruments from many different places and many different times • The questions are interesting • There is a lot of it (soon petabytes) • GB: It is not over-funded aka it’s poor

  19. Making Discoveries • When and where are discoveries made? • Always at the edges and boundaries • Going deeper, collecting more data, using more colors…. • Metcalfe’s law • Utility of computer networks grows as the number of possible connections: O(N2) • VO: Federation of N archives • Possibilities for new discoveries grow as O(N2) • Current sky surveys have proven this • Very early discoveries from SDSS, 2MASS, DPOSS

  20. What can be learned from Sky Server? • It’s about data, not about harvesting flops • 1-2 hr. query programs versus 1 wk programs based on grep • 10 minute runs versus 3 day compute & searches • Database viewpoint. 100x speed-ups • Avoid costly re-computation and searches • Use indices and PARALLEL I/O. Read / Write >>1. • Parallelism is automatic, transparent, and just depends on the number of computers/disks. • Limited experience and talent to use dbases.

  21. Soon: The Virtual Observatory • Many new surveys are coming • SDSS is a dry run for the next ones • LSST will be 5TB/night • All the data will be on the Internet • ftp, web services… • Data and applications will be associated with the instruments • Distributed world wide, cross-indexed • Federation is a must • Will be the best telescope in the world • World Wide Telescope • Finds the “needle in the haystack” • Successful demonstrations in Jan’03

  22. Emerging Concepts • Standardizing distributed data access • Web Services, supported on all platforms • XML: Extensible Markup Language • SOAP: Simple Object Access Protocol • WSDL: Web Services Description Language • Standardizing distributed computing • Grid Services • Custom configure remote computing dynamically • Build your own remote computer, and discard • Virtual Data: new data sets on demand • Both needed for Data Exploration

  23. Computational Science Simulations based on Web Services Gerd Heber Cornell Theory Center heber@tc.cornell.edu

  24. Three Flavors of Adaptivity • Application-level • Mathematical model • High/low confidence • Algorithm-level • Discretization method • Solution technique • System-level • Resource availability • Fault tolerance

  25. The Problem • Do distributed,coupled and adaptive multi-physics simulations of • Mechanics of chemically-reacting flows • (Damage) Thermo-Mechanics of solids • Components provided as Web Services

  26. Geography • Cornell University • Theory Center • Department of Computer Science • Department of Civil Engineering • University of Alabama • Mississippi State University • College of William and Mary

  27. Workflow

  28. Components • MiniCAD • Meshers • Surface (Delaunay, quality guarantees) • Volume (Dmesh, Jmesh, Gmesh) • Fluid/Thermal simulation (Loci, CHEM) • Thermo-mechanical component (CPTC) • Fracture mechanics • Visualization (OpenDX + SQL Server)

  29. Web Services • “Web Services are self-contained, modular applications that can be described, published, located, and invoked over a network, …” (IBM) • Service oriented architecture: Publish, find, bind • XML, SOAP, UDDI, WSDL

  30. Features and Requirements • Distributed expertise • No porting • Network accessibility (“firewall compliant”) • Platform and language neutrality • Security • Industry standards • Metadata • State • Students shouldn’t waste too much time with coding!

  31. GrADS Vision • Build a National Problem-Solving System on the Grid • Transparent to the user, who sees a problem-solving system • Software Support for Application Development on Grids • Goal: Design and build programming systems for the Grid that broaden the community of users who can develop and run applications in this complex environment • Challenges: • Presenting a high-level application development interface* • If programming is hard, the Grid will not not reach its potential • Designing and constructing applications for adaptability • Late mapping of applications to Grid resources • Monitoring and control of performance • When should the application be interrupted and remapped? *GB note: This is a superset of the previously unsolved clusters programming problem!

  32. Performance Feedback Real-time Performance Performance Problem Software Monitor Components Resource Config- Whole- Source Grid Negotiator urable Appli- Program Negotiation Runtime Object Compiler cation System Scheduler Program Binder Libraries GrADSoft Architecture

  33. Network for Earthquake Eng. Simulation • NEESgrid: US national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other • On-demand access to experiments, data streams, computing, archives, collaboration NEESgrid: Argonne, Michigan, NCSA, UIUC, USC www.neesgrid.org

  34. “Scales Away” spans organizations & geographies “Scales Out” by adding machines “Scales Up” on large systems “Scales In” on a machine “Scales Down” to devices A Universal Architecture for Web Services… Microsoft Vision Security Reliable Messaging Transactions Routing … Messaging Infrastructure Distributed applications Vertical processes Embedded systems Network equipment … 39

  35. Web Services: Level IFoundation to Build Upon • Basic profile • Defined by WS-I • XML, SOAP, WSDL, UDDI • Broad vendor support • WS-I assures widespread compatibility

  36. Level II Secure, Reliable, Transacted Connected Applications Business Process Management … Secure Reliable Transacted Metadata Messaging XML Transports

  37. Level IIIFrom Infrastructure to Solutions • Application schemas • Domain specific profiles • Vertical industry services

  38. Vison: Community/Data-Centric ComputingVersus Machine-Centered Centers • Goal: Enable technical communities to create and take responsibility for their own computing environments of personal, data, and program collaboration and distribution • Design based on technology and cost, e.g. networking, apps programs maintenance, databases, and providing 24x7 web and other services • Many alternative styles and locations are possible • Service from existing centers, including many state centers • Software vendors could be encouraged to supply apps web services • NCAR style center based on shared data and apps • Instrument- and model-based databases. Both central & distributed when multiple viewpoints create the whole. • Wholly distributed services supplied by many individual groups

  39. Community/Data Centric: “web service” • Community is responsible • Planned & budget as resources • Responsible for its infrastructure • Apps are from community • Computing is integral to work • In sync with technologies • 1-3 Tflops/$M; 1-3 PBytes/$M to buy smallish Tflops & PBytes. • New scalables are “centers” • Community can afford and evolve • Dedicated to a community • Program, data & database centric • May be aligned with instruments or other community activities • Output = web service; Can communities form that can supply services?

  40. Commitment to standards • A general architecture comes much from understanding the problems • Understanding the problems comes from actually solving such problems • This is bottom-up, based on experience • Microsoft is committed to develop community-wide web services standards… • Is the Grid Forum equally committed?

  41. The EndHow can GRIDs become a real, useful, computer structure?Get a life.Use the standards and tools. Adopt an application and/or community…now!

More Related