This report discusses key insights and findings from various e-science and grid application projects, including the Earth System Grid and GriPhyN, among others. It highlights the necessity of collaborative platforms for accessing vast data resources and performing sophisticated analyses in fields such as environmental science, physics, and earthquake engineering. The challenges of building collaborative infrastructures, engaging communities, and deploying grid technologies are explored, emphasizing the importance of standards, interoperability, and international cooperation in advancing scientific research.
Knowledge Environments for Science: Representative Projects
Ian Foster
Argonne National Laboratory
University of Chicago
http://www.mcs.anl.gov/~foster
Symposium on Knowledge Environments for Science, November 26, 2002
Comments Informed By Participation in …
• E-science/Grid application projects, e.g.
  • Earth System Grid: environmental science
  • GriPhyN, PPDG, EU DataGrid: physics
  • NEESgrid: earthquake engineering
• Grid technology R&D projects
  • Globus Project and the Globus Toolkit
  • NSF Middleware Initiative
• Grid infrastructure deployment projects
  • Alliance, TeraGrid, DOE Sci. Grid, NASA IPG
  • Intl. Virtual Data Grid Laboratory (iVDGL)
• Global Grid Forum: community & standards
Data Grids for High Energy Physics
• Enable community to access & analyze petabytes of data
• Coordinated international projects
  • GriPhyN, PPDG, iVDGL, EU DataGrid, DataTAG
• Challenging computer science research
• Real deployments and applications
• Defining analysis architecture for LHC
NEESgrid Earthquake Engineering Collaboratory
U. Nevada, Reno
www.neesgrid.org
Communities Need Not Be Large: E.g., Astronomical Data Analysis
• Size distribution of galaxy clusters?
• Chimera Virtual Data System + GriPhyN Virtual Data Toolkit + iVDGL Data Grid (many CPUs)
[Figure: Galaxy cluster size distribution]
www.griphyn.org/chimera
A “Knowledge Environment” is a System For …
• “Small teams”
• “Accessing specialized devices”
• “Interpersonal collaboration”
• “Sharing information”
• “Accessing services”
• “Enabling large-scale computation”
• “Integrating data”
• “Large communities”
It’s All of the Above: Enabling “Post-Internet Science”
• Pre-Internet science
  • Theorize &/or experiment, in small teams
• Post-Internet science
  • Construct and mine very large databases
  • Develop computer simulations & analyses
  • Access specialized devices remotely
  • Exchange information within distributed multidisciplinary teams
• Need to manage dynamic, distributed infrastructures, services, and applications
Enabling Infrastructure for Knowledge Environments for Science (aka “The Grid”)
“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”
Grid Infrastructure
• What?
  • Broadly deployed services in support of fundamental collaborative activities
  • Services, software, and policies enabling on-demand access to critical resources
• Open standards, software, infrastructure
  • Open Grid Services Architecture (GGF)
  • Globus Toolkit (Globus Project: ANL, USC/ISI)
  • NMI, iVDGL, TeraGrid
• Grid infrastructure R&D&ops is itself a distributed & international community
Lessons Learned (1)
• Importance of standard infrastructure
  • Software: facilitate construction of systems, and construction of interoperable systems
  • Services: authentication, discovery, …, …
  • Needs investment in research, development, deployment, operations, training
• Building & operating infrastructure is hard
  • Challenging technical & policy issues
  • Requisite skills not always available
  • Can challenge existing organizations
Lessons Learned (2)
• Importance of community engagement
  • “Maine and Texas must have something to communicate”
  • Big science traditions seem to help
  • Discipline champions certainly help
  • Effective projects are often true collaborations between disciplines and computer scientists
• Importance of international cooperation
  • Science is international, so is expertise
  • Challenging, requires incentives & support
Lessons Learned (3)
• Collaborative science/Grids are a wonderful source of computer science problems
  • E.g., “virtual data grid” (GriPhyN): data, programs, derivations as community resources
  • E.g., security within virtual organizations
• Work in this space can be of intense interest to industry
  • E.g., current rapid uptake of Grid technologies