Gridflows and SDSC Matrix

Arun Swaran Jagatheesan (arun@sdsc.edu), San Diego Supercomputer Center (SDSC), University of California, San Diego. 10th Annual NPACI/SDSC Summer Computing Institute, August 23-27, 2004, Sun Diego, California, USA.


Presentation Transcript


  1. Gridflows and SDSC Matrix • Arun Swaran Jagatheesan, arun@sdsc.edu • San Diego Supercomputer Center (SDSC), University of California, San Diego • 10th Annual NPACI/SDSC Summer Computing Institute, August 23-27, 2004, Sun Diego, California, USA

  2. “Infotainment” Outline • Infotainment • Sun Diego • Let’s Vote • Results • Action based on results

  3. Sun Diego

  4. You Decide • Should we sit here, browse the web and talk Computer Science? Or • Go out and enjoy Sun Diego?

  5. Let’s Decide • Let us listen to the Matrix Gridflow Talk • Let us go out and ‘njoy San Diego • Let us make sure all the websites are working today

  6. Let’s Decide • Let us listen to the Matrix Gridflow Talk • Let us make sure all the websites are working today • Let us go out and ‘njoy San Diego

  7. Backup Slides • Just in case there are some people (like me) who chose to go with the talk instead of going around San Diego…

  8. Outline of the Talk • Uses of Grid and Workflow • Do you really need it? • When to use and when not to use? • What is needed to do Gridflows • Scientists and Grids • Workflow and Grids • Data Grid Language • SDSC Matrix • Tools

  9. Data handling pipeline in SCEC (data → information pipeline) • Pipeline could be triggered by input at data source or by a data request from user • Metadata derivation • Ingest data • Ingest metadata • Determine analysis pipeline • Initiate automated analysis • Use the optimal set of resources based on the task, on demand • Organize result data into distributed data grid collections • All gridflow activities stored for data flow provenance

  10. Data → Discovery (diagram: Digital entities, Meta-data, Services, State) • New data • Digital entities: updates relationships among data in collections • Meta-data: services invoked to analyze new relationships • Services: DGMS applications get notified of state updates • State

  11. What do they want? • We know the business (scientific) process • CyberInfrastructure is all we care about (why bother about atoms or DNA)

  12. What do they want? • Use DGL to describe your process logic with abstract references to datagrid infrastructure dependencies

  13. Gridflows • Grid Workflow (Gridflow) is the automation of an execution pipeline in which data or tasks are processed through multiple autonomous grid resources according to a set of procedural rules • Gridflows are executed on resources that are dynamically obtained through the confluence of one or more autonomous administrative domains (peers)

  14. Gridflow Language and CS Domains • Compiler Design: variable scope definition, recursive grammar, execution stack management • Data Modeling: schema definitions for gridflow patterns • Grid Computing: Data Grid data types, Virtual Organization, basic operations, … • Other concepts and standards: rules, W3C XQuery, GGF JSDL?

  15. Gridflow Language Requirements • High-level abstract descriptions: abstract description of cyberinfrastructure dependencies • Simple yet flexible: flexible enough to describe complex requirements (no brute force) • Gridflow dependency patterns: based on execution structure and data semantics, e.g. (parallel, sequential, fork-new), (milestones, for-each, switch-case), … • Asynchronous execution: for long-running requests • Querying using an existing standard: XQuery
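
The dependency patterns named on this slide (sequential, parallel, for-each, switch-case) can be illustrated with a minimal Python sketch. This is not DGL syntax and not the SDSC Matrix implementation; the `Step` type, the combinator names, and the shared `ctx` dictionary are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List

# Hypothetical: a step is any callable that reads and updates a shared context.
Step = Callable[[dict], None]

def sequential(steps: List[Step]) -> Step:
    """Run the steps one after another, in order."""
    def run(ctx: dict) -> None:
        for step in steps:
            step(ctx)
    return run

def parallel(steps: List[Step]) -> Step:
    """Run the steps concurrently and wait for all of them."""
    def run(ctx: dict) -> None:
        with ThreadPoolExecutor() as pool:
            # list() waits for completion and re-raises any step's exception.
            list(pool.map(lambda s: s(ctx), steps))
    return run

def for_each(key: str, items: Iterable, body: Step) -> Step:
    """Run the body once per item, exposing the item under ctx[key]."""
    def run(ctx: dict) -> None:
        for item in items:
            ctx[key] = item
            body(ctx)
    return run

def switch_case(selector: Callable[[dict], str], cases: Dict[str, Step]) -> Step:
    """Pick exactly one branch based on the current context."""
    def run(ctx: dict) -> None:
        cases[selector(ctx)](ctx)
    return run

def record(name: str) -> Step:
    """Toy step that only logs its own name."""
    def step(ctx: dict) -> None:
        ctx["log"].append(name)
    return step

flow = sequential([
    record("ingest"),
    for_each("file", ["a", "b"], lambda ctx: ctx["log"].append("meta:" + ctx["file"])),
    switch_case(
        lambda ctx: "big" if len(ctx["log"]) > 2 else "small",
        {"big": record("archive"), "small": record("skip")},
    ),
])
ctx = {"log": []}
flow(ctx)
```

Because every combinator returns another `Step`, patterns nest freely, which is one way to read the "simple yet flexible" requirement.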

  16. Gridflow Language Requirements • Process metadata and annotations: runtime definition, update and querying of metadata • Runtime management of gridflows: stop a gridflow at run time • Partitioning: facility in the language to divide a gridflow request into multiple requests • Import descriptions: refer to other gridflows in execution

  17. Data Grid Language (DGL) • DGL is just a language specification • Can be used in any commercial or academic data grid software • DGL describes gridflows and their dependencies

  18. Gridflow Process I (diagram) • End user using DGBuilder • Gridflow description • Data Grid Language

  19. Gridflow Process II (diagram) • Abstract gridflow using Data Grid Language • Planner • Concrete gridflow using Data Grid Language
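
The slide shows a planner turning an abstract gridflow into a concrete one but does not say how. As a rough sketch only, the step can be pictured as replacing logical resource names with physical ones. The dictionary layout, the `catalog` structure, the host names, and the "take the first candidate" rule below are all hypothetical placeholders, not the Matrix planner's behavior.

```python
from typing import Dict, List

def plan(abstract_steps: List[dict], catalog: Dict[str, List[str]]) -> List[dict]:
    """Bind each step's logical resource name to a physical resource.

    Hypothetical rule: pick the first candidate listed in the catalog.
    A real planner would presumably weigh load, locality, policy, etc.
    """
    concrete = []
    for step in abstract_steps:
        candidates = catalog.get(step["resource"], [])
        if not candidates:
            raise LookupError(f"no physical resource for {step['resource']!r}")
        # Copy the step so the abstract description stays untouched.
        concrete.append({**step, "resource": candidates[0]})
    return concrete

# Hypothetical abstract gridflow and resource catalog.
abstract = [
    {"op": "ingest", "resource": "scratch"},
    {"op": "replicate", "resource": "archive"},
]
catalog = {
    "scratch": ["disk-a.example.org"],
    "archive": ["tape-1.example.org", "tape-2.example.org"],
}
concrete = plan(abstract, catalog)
```

Keeping the abstract description unchanged means the same gridflow can be planned again later against a different catalog.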

  20. Gridflow Process III (diagram) • Concrete gridflow using Data Grid Language • Gridflow Processor • Gridflow P2P Network

  21. DGL Structure (data model) • Flow Logic: Pre-Process; Structure (parallel, sequential, etc.); ECA rule-based definitions; Post-Process • Runnable: recursive definition of runnables as either a data operation or an executable process (Job) • Meta-data
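
The recursive structure on this slide can be sketched as a small tree type: a flow holds its flow logic plus children, and each child is either another flow or a leaf step (a data operation or an executable job). The class and field names below are assumptions for illustration; they are not taken from the DGL schema.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Step:
    """Leaf runnable: a data operation or an executable process (job)."""
    name: str
    kind: str  # hypothetical label: "data-operation" or "job"

@dataclass
class Flow:
    """Composite runnable: flow logic plus nested flows and steps."""
    name: str
    structure: str = "sequential"  # hypothetical: "sequential", "parallel", ...
    children: List[Union["Flow", Step]] = field(default_factory=list)

def step_names(node: Union[Flow, Step]) -> List[str]:
    """Collect leaf step names depth-first, following the recursion."""
    if isinstance(node, Step):
        return [node.name]
    names: List[str] = []
    for child in node.children:
        names.extend(step_names(child))
    return names

def depth(node: Union[Flow, Step]) -> int:
    """Nesting depth: a bare step is 0, a flow is 1 + its deepest child."""
    if isinstance(node, Step):
        return 0
    return 1 + max((depth(c) for c in node.children), default=0)

pipeline = Flow("pipeline", "sequential", [
    Step("ingest", "data-operation"),
    Flow("analysis", "parallel", [
        Step("run-model", "job"),
        Step("derive-metadata", "data-operation"),
    ]),
    Step("replicate", "data-operation"),
])
```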

  22. Operations in DGL • Execute Process (DAG, java, WSDL, etc.) • Very generic Datagrid operations • Copy directories/files • Change Permissions (Chmod) • Create directory/file/archive • Delete directory/file/archive • Ingest/download URL or any data source • Replicate, Rename, List • SeekNWrite, SeekNRead • Ingest, Query any type of Metadata • Approved by the FDA and SRB for you

  23. Components of DGL • DGL document is either a request or a response • Data Grid Request • Could be a Flow (aggregation of operations) • Or could be a Status Query • Data Grid Response • Could be a Flow Acknowledgement • Or could be a Status Response • Can be made Synchronous or Asynchronous • Flexibility for any type of Implementation
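
The request/response pairing on this slide can be sketched as four message types and a toy in-memory processor that acknowledges a flow immediately and answers status queries later (the asynchronous case). The class names, the `req-N` identifiers, and the state strings are hypothetical; they are not DGL element names.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List, Union

@dataclass
class FlowRequest:
    operations: List[str]  # aggregation of operations

@dataclass
class StatusQuery:
    request_id: str

@dataclass
class FlowAcknowledgement:
    request_id: str

@dataclass
class StatusResponse:
    request_id: str
    state: str

Request = Union[FlowRequest, StatusQuery]
Response = Union[FlowAcknowledgement, StatusResponse]

class ToyProcessor:
    """In-memory stand-in for a gridflow processor (illustration only)."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._states: Dict[str, str] = {}

    def handle(self, request: Request) -> Response:
        if isinstance(request, FlowRequest):
            # Asynchronous style: acknowledge now, let the client poll later.
            request_id = f"req-{next(self._ids)}"
            self._states[request_id] = "accepted"
            return FlowAcknowledgement(request_id)
        if isinstance(request, StatusQuery):
            state = self._states.get(request.request_id, "unknown")
            return StatusResponse(request.request_id, state)
        raise TypeError(f"unsupported request: {request!r}")

processor = ToyProcessor()
ack = processor.handle(FlowRequest(["ingest", "replicate"]))
status = processor.handle(StatusQuery(ack.request_id))
```

A synchronous variant would run the flow inside `handle` and return the final status directly instead of an acknowledgement.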

  24. Summary • A standard description language is needed • Requirements of the language • Data Grid Language (DGL) • Recursive definition of flows and steps • Metadata or variable scopes • Rules • Can be partitioned (sub-divided) • Components of Data Grid Language • Next step: talk to scheduling or heuristics people

  25. Got ideas/suggestions? • Contact: SDSC Matrix project, arun@sdsc.edu • Google keyword: SDSC Gridflow
