Towards Truly Ubiquitous Cyberinfrastructure

Presentation Transcript

  1. Towards Truly Ubiquitous Cyberinfrastructure LAGrid ’07 Jim Myers jimmyers@ncsa.uiuc.edu Associate Director for Cyberenvironments and Technologies, National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign National Center for Supercomputing Applications

  2. National Center for Supercomputing Applications
     • Cyber-resources
     • Innovative Systems
     • Communities and Applications
     • Cyberenvironments

  3. Outline
     • What’s Changing in Science?
     • What Role should Cyberinfrastructure (CI) play?
     • What Do Ubiquitous (and Persistent) mean for CI Development?
     • Designing for Ubiquity
     • Some Examples
     • Conclusions

  4. How is Science Changing?
     • Quantitative Modeling and Simulation
     • Better Data (e.g. Higher Signal to Noise)
     • More Data (e.g. High Throughput)
     • Closer ties between research and application
     • Investigation of subtle, non-linear, multi-dimensional phenomena
     • Statistical analysis of complex systems

  5. The Research Process It’s just the Scientific Method…


  6. Fg~m Assumptions Reference Data Controls… Reduction Statistics Analysis of Alternatives… The Research Process With Experimental Design… Conceptual Logical Physical

  7. Fg~m The Research Process And Multiple, Coupled Objectives… Method Instrument High-speed camera Scientific

  8. The Research Process Collaboration Reference Data Curation Model Validation Sub-discipline Creation Best-practice Dissemination Application Education … And Community Processes … Method Instrument Scientific

  9. The Research Process And It’s No Longer Fg~m … Non-linear, high-dimensional, coupled, multi-scale phenomena Method Instrument Scientific

  10. ‘Amdahl’s Law’ for Scientific Progress: ! Data production Processing power Data transfer/storage Data discovery Translation Experiment setup Group coordination Tool integration Training Feature Extraction Data interpretation Acceptance of new models/tools Dissemination of best practices Interdisciplinary communication

  11. Dq2 Valid Range Dq1 What’s Needed to Support the Research Lifecycle? Gap Analysis Standards / Best practice Algorithms/ Services Reference Data Sensor Data Publish Share Coordinate Curate Validate Relate Discover Mine Translate Reference Extract Provenance Engineering Views Annotation Experiment Design Project Execution

  12. Consider a Spherical Cow…
     • There is a class of bovine-related problems for which shape is not important
     • Yet shape is clearly needed in a general cow model
     • Should we “reach consensus” here?
     • Is there one ‘best’ way to map volume to height?
     Moo! ACME Trucking

  13. Key Issues for Ubiquitous & Persistent CI
     • CI must be built before the parts are done
     • It must be evolvable by independent parties
     • It must enable coordination without central control
     • It must allow science to evolve / progress
     • No fixed domain model
     • Researchers/educators must be able to work in multiple communities/value chains (across CI projects)
     • It must convey knowledge as well as tools to end users
     • It must align the interests of CI funders, developers, providers, users, …

  14. Can this be done?

  15. Yes!
     • Design Principles for loosely coupled, scalable (not scaled) systems and organizations
     • Agile, community/science driven development processes over longer-term community/science driven design
     …e-Science, Semantic Grid, Web 2.0 … …intelligence at the edges…

  16. Key Cyberenvironment Design Concepts
     • Explicit Representations Separating How from What:
        • Content (metadata, global IDs, …)
        • Process (workflow, provenance, …)
        • Virtual Organizations (policies, resources, semantics, translation)
        • GUI Integration (portals, rich clients, …)
        • …

  17. Mid-America Earthquake Center MAEViz – an Example Cyberenvironment (Consequence-Based Risk Management for Seismic Events) Decision Support Damage Prediction Fragility Models Inventory Selection • Engineering View of MAE Center Research • Portal-based Collaboration Environment • Distributed Data/metadata Sources • Multi-disciplinary Collaboration Hazard Definition

  18. Content Management
     • Whatever ‘thing’ we are talking about, we want
        • To know its type,
        • Have descriptive information so we can find and categorize it,
        • Be able to version it,
        • Specify who owns and can access it,
        • Define its relationships to other things,
        • Manage copies of it / know when you have it,
        • Be able to translate it,
        • Dynamically add new information we learn about it,
        • …
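
The content-management wish list above can be made concrete with a small record type. The following Python sketch is illustrative only: `ContentItem`, its fields, and the `derivedFrom` predicate are hypothetical names chosen for this example, not an API from the talk or from Tupelo, and the placeholder DOI is invented. It covers typing, descriptive metadata, versioning, ownership and access, relationships, and dynamic annotation; copy management and translation are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ContentItem:
    """Hypothetical record for one managed 'thing' (assumption, not from the talk)."""
    global_id: str                       # e.g. an ARK, DOI, or LSID string
    content_type: str                    # what kind of thing this is
    owner: str
    version: int = 1
    readers: Set[str] = field(default_factory=set)
    metadata: Dict[str, str] = field(default_factory=dict)
    relations: List[Tuple[str, str]] = field(default_factory=list)  # (predicate, target id)

    def annotate(self, key: str, value: str) -> None:
        """Dynamically add new descriptive information."""
        self.metadata[key] = value

    def relate(self, predicate: str, target_id: str) -> None:
        """Record a relationship to another identified thing."""
        self.relations.append((predicate, target_id))

    def new_version(self) -> "ContentItem":
        """Return a successor that records which version it was derived from."""
        successor = ContentItem(
            global_id=self.global_id,
            content_type=self.content_type,
            owner=self.owner,
            version=self.version + 1,
            readers=set(self.readers),
            metadata=dict(self.metadata),
            relations=list(self.relations),
        )
        successor.relate("derivedFrom", f"{self.global_id}#v{self.version}")
        return successor

    def can_read(self, user: str) -> bool:
        """Owners and explicitly listed readers may access the item."""
        return user == self.owner or user in self.readers


# Placeholder identifier and names, invented for the example.
item = ContentItem("doi:10.0000/example", "fragility-model", owner="alice")
item.annotate("region", "Memphis")
v2 = item.new_version()
```

Keeping metadata and relations as open-ended collections, rather than fixed fields, is one way to respect the "no fixed domain model" requirement stated earlier in the deck.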

  19. Content Aware Secure Enterprise Data • ARKs, DOI, LSID • WebDAV, JCR, RDF, SAM, Tupelo Desktop Public Reference Data Data/Metadata

  20. Process Management Framework
     • Workflow description as a means of communicating experiment protocol
     • Actors built as modules, web services, grid jobs…
     • Process execution managed through direct calls, service calls, data transfer, events, manual processes, …
     • Workflow generated by applications, by example, graphically, or discovered from provenance
     • Execution performed using an engine with appropriate speed, reliability, availability of modules, etc.
     • Workflow templates and provenance records treated as sharable content (versioned, compared, documented, …)
     • Process descriptions captured at multiple levels of detail (scientific, mathematical, engineering, debugging, …)
     • Community Provenance and Process extend across workflows
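
Two of the ideas above, capturing provenance during execution and rediscovering a workflow from that provenance, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: `run_workflow`, `workflow_from_provenance`, and the two toy actors are hypothetical names, and plain functions stand in for the modules, web services, or grid jobs the slide mentions.

```python
from typing import Any, Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[Any], Any]]  # (step name, actor)


def run_workflow(steps: List[Step], value: Any) -> Tuple[Any, List[Dict[str, Any]]]:
    """Run actors in order and return the result plus a provenance log."""
    provenance: List[Dict[str, Any]] = []
    for name, actor in steps:
        result = actor(value)
        provenance.append({"step": name, "input": value, "output": result})
        value = result
    return value, provenance


def workflow_from_provenance(
    provenance: List[Dict[str, Any]],
    actors: Dict[str, Callable[[Any], Any]],
) -> List[Step]:
    """Rebuild a reusable workflow template from a recorded run."""
    return [(record["step"], actors[record["step"]]) for record in provenance]


# Hypothetical actors standing in for modules, web services, or grid jobs.
actors = {"scale": lambda x: x * 2, "offset": lambda x: x + 1}

result, log = run_workflow([("scale", actors["scale"]), ("offset", actors["offset"])], 3)
replayed, _ = run_workflow(workflow_from_provenance(log, actors), 10)
```

Because the log is ordinary data, it can be versioned, compared, and shared like any other content, which is the point of treating provenance records as sharable content.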

  21. Process Management Provenance Workflow Creation Workflow-by-Example Application Interface Hierarchical Workflow Scripting X=f(y) Y = f2(z)

  22. Process Aware • Workflow, Provenance, RDF Discover Process Capture Execute Report

  23. Virtual Organizations
     • Grid/portal concept for managing
        • Single sign-on security
        • access control policies
        • toolsets and views
        • data sources
        • processes and results
        • resource pools
        • vocabularies and models
        • …
     • Tools query VO manager to configure themselves based on VO context/policies/preferences
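
The last point, tools querying a VO manager to configure themselves, might look roughly like the Python sketch below. Everything named here is an assumption made for illustration: `VOManager`, `AnalysisTool`, the `seismic-risk` VO, and its data sources are hypothetical, and a real VO manager would be a shared network service with proper authentication rather than an in-memory dictionary.

```python
from typing import Dict, List, Set


class VOManager:
    """Hypothetical in-memory VO registry; a real one would be a shared service."""

    def __init__(self) -> None:
        self._members: Dict[str, Set[str]] = {}
        self._context: Dict[str, Dict[str, List[str]]] = {}

    def define_vo(self, name: str, members: Set[str], context: Dict[str, List[str]]) -> None:
        self._members[name] = set(members)
        self._context[name] = context

    def context_for(self, name: str, user: str) -> Dict[str, List[str]]:
        """Return the VO's configuration, using membership as the access policy."""
        if user not in self._members.get(name, set()):
            raise PermissionError(f"{user} is not a member of {name}")
        return self._context[name]


class AnalysisTool:
    """A tool that configures itself from VO context instead of hard-coded settings."""

    def __init__(self, manager: VOManager, vo: str, user: str) -> None:
        context = manager.context_for(vo, user)
        self.data_sources = context.get("data_sources", [])
        self.vocabularies = context.get("vocabularies", [])


manager = VOManager()
manager.define_vo(
    "seismic-risk",  # hypothetical VO name
    members={"alice"},
    context={"data_sources": ["building-inventory"], "vocabularies": ["hazard-terms"]},
)
tool = AnalysisTool(manager, "seismic-risk", "alice")
```

The same tool binary can then serve several communities: switching the VO name changes its data sources and vocabularies without changing its code.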

  24. Pluggable User Interfaces
     • Portlet/Rich-Client concept, broadened to include VO configuration of
        • Content sources
        • Events
        • Workflow/Provenance repositories
        • Data models/ontologies
        • Translations
     • Portal technologies: JSR 168, Teamlets, WSRP, JSR 286, …
     • Rich Clients: Eclipse/OSGi, JSR 170, 283, …

  25. SSO Group Aware • Collaboratory, Portal, … Plan, Coordinate, Share, Compare Wiki Task List Chat Document Repository Scenario Repository Training Materials

  26. Dynamic • Plug-ins, WSRP, Provenance New Third-Party Analyses Compare, Contrast, Validate Auto-update MAEviz GIS Workflow Data Eclipse RCP Plug-in Framework

  27. Rich, VO-oriented plug-in mechanism Third-party Plug-in Adds to menu Joins Security Context Adds to interface Maps data model Adds to provenance Adds to workflow
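
The plug-in relationships named on this slide (adds to menu, joins the security context, adds to provenance) can be illustrated with a toy host application. The deck's own example is built on the Eclipse RCP plug-in framework; the Python sketch below is not that mechanism, only an analogy, and `Host`, `Plugin`, and the menu label are hypothetical names.

```python
from typing import Callable, Dict, List


class Host:
    """Hypothetical host application exposing extension points to plug-ins."""

    def __init__(self, user: str) -> None:
        self.user = user  # security context shared with every plug-in
        self.menu: Dict[str, Callable[[float], float]] = {}
        self.provenance: List[str] = []

    def register(self, plugin: "Plugin") -> None:
        """Let a plug-in attach itself to the host's extension points."""
        plugin.install(self)

    def invoke(self, label: str, value: float) -> float:
        """Run a menu action and record who ran it and what it produced."""
        result = self.menu[label](value)
        self.provenance.append(f"{self.user} ran {label}({value}) -> {result}")
        return result


class Plugin:
    """Hypothetical third-party analysis contributed after the host shipped."""

    label = "Double damage estimate"

    def install(self, host: Host) -> None:
        host.menu[self.label] = lambda value: value * 2


host = Host(user="alice")
host.register(Plugin())
estimate = host.invoke("Double damage estimate", 1.5)
```

The host records provenance on the plug-in's behalf, so a third-party analysis joins the shared record without having to implement logging itself.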

  28. Environmental Observatories
     Rely on advances in:
     • sensors and sensor networks at intensively instrumented sites shared by the research community
     • cyberinfrastructure with high bandwidth to connect the sites, data repositories, and researchers into collaboratories
     • distributed modeling platforms
     From USGS

  29. Observatories as a Community Focus

  30. Knowledge Store Environmental Observatory Processes Model Dev/ Validation Observatory Operation and Evolution Research & Education Projects Documentation Coordination Recommendations On-demand Services and HPC Events Data Access Operations/Expt. Design Community Coordination/ Knowledge Creation Sensors Data Products Derived Data Products QA/QC Third-party Resources Archive Cache Cache Cache Storage Community Provisioning

  31. Ubiquity = Supporting Scientific Discourse
     • Cyberenvironments represent rethinking current practice to create CI
        • That is enabling rather than stifling
        • That evolves as fast as research evolves
        • That connects research and practice
        • That empowers individuals to contribute new resources
        • That can be ubiquitous and persistent
        • That enables resource repurposing to address new questions
        • That opens new career paths for CI developers, data scientists, systems engineers, …

  32. Cyberinfrastructure Challenges
     • How can CI increase the productivity and competitiveness of the scientific community?
     • How can CI developers enhance their capacity to respond to user needs more rapidly and more effectively?
     • How should CI technical design and organizational structures change to enable solutions at scale – as a ubiquitous, persistent infrastructure for science and engineering research and education?

  33. Mosaic and Cyberenvironments
     • Mosaic
        • By the early 1990s, the internet had a wealth of resources, but they were inaccessible to most scientists
        • Individual publishing
        • Browsing versus retrieving
        • See “Web 2.0 ... The Machine is Us/ing Us”
     • Cyberenvironments
        • By the early 2000s, the internet and grid had a wealth of interactive resources, but they were inaccessible to most scientists
        • Individual information models
        • Fusion versus gathering

  34. Mathematical, Information and Computational Sciences Division of the Office of Science Acknowledgments NCSA CET Staff NCSA Collaborators CI Community National Science Foundation/State of Illinois/ONR … and Thank You