e-Science & Grid Computing - introduction -
• What is e-Science? What is the Grid?
• Grid middleware
• Virtual Organisations - some issues
• Data access & integration
• Metadata
• MSc in e-Science Technology at-a-glance
http://www.csd.abdn.ac.uk/teaching/levelfive/CS5547
Some definitions
• e-Science
  • “The large scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet.
  • “Typically, a feature of such collaborative scientific enterprises is that they will require access to very large data collections, very large scale computing resources and high performance visualisation back to the individual user scientists.”
  • [nesc.ac.uk]
• Grid
  • “An infrastructure that enables flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions and resources.”
  • [Foster & Kesselman, globus.org]
The Global Grid
http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt
UK SuperJANET 4/5
• Links up to 2.5 Gbit/s
http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt
Scale, distribution, complexity
[Figure panels: Multiscale modelling of cancer; Multiscale modelling of the heart. Figure labels: Person, Cell]
http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt
Large Hadron Collider (LHC)
http://www.nesc.ac.uk/events/ahm2004/presentations/BobJones.ppt
http://gridportal.hep.ph.ic.ac.uk/rtm/
e-Science & engineering
[Figure labels: Engine flight data, London Airport, Airline office, New York Airport, Grid Diagnostics Centre, Maintenance Centre, American data center, European data center]
[Figure labels: XTO, Engine Model, Case Based Reasoning, Signal Data Explorer]
“A Significant factor in the success of the Rolls-Royce campaign to power the Boeing 7E7 with the Trent 1000 was the emphasis on the new aftermarket support service for the engines provided via DS&S. Boeing personnel were shown DAME as an example of the new ways of gathering and processing the large amounts of data that could be retrieved from an advanced aircraft such as the 7E7, and they were very impressed”, DS&S 2004
• Companies: Rolls-Royce, DS&S, Cybula
• Universities: York, Leeds, Sheffield, Oxford
http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt
e-Science workflows
[Figure: workflow with stages labelled A, B, C]
• A: Identification of overlapping sequence
• B: Characterisation of nucleotide sequence
• C: Characterisation of protein sequence
http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt
Grid middleware: Globus toolkit (GT)
• The Anatomy of the Grid: Enabling Scalable Virtual Organizations. I. Foster, C. Kesselman, S. Tuecke. International J. Supercomputer Applications, 15(3), 2001.
• The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. I. Foster, C. Kesselman, J. Nick, S. Tuecke. Open Grid Service Infrastructure WG, Global Grid Forum, 2002.
http://www.globus.org
Grid & Web Services convergence
• The definition of WSRF means that the Grid and Web services communities can move forward on a common base.
http://www.globus.org
Web & Grid Services
[Figure labels: WS-I, ‘WS-I+’ profile]
• Standards that have broad industry support and multiple interoperable implementations
• Specifications that are emerging from standardisation process and are recognised as being ‘useful’
• Specifications that have/will enter a standardisation process but are not stable and are still experimental
http://www.globus.org
UK National Grid Service
• Interfaces (figure label: OGSI::Lite)
• Projects
  • e-Minerals
  • e-Materials
  • Orbital Dynamics of Galaxies
  • Bioinformatics (using BLAST)
  • GEODISE project
  • UKQCD Singlet meson project
  • Census data analysis
  • MIAKT project
  • e-HTPX project
  • RealityGrid (chemistry)
• Users
  • Leeds, Oxford, UCL, Cardiff, Southampton, Imperial, Liverpool, Sheffield, Cambridge, Edinburgh, QUB, BBSRC, CCLRC
http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt
Grid Virtual Organisations - some issues
• Forming a VO dynamically
  • partner identification
  • Service Level Agreements (SLAs)
  • QoS, trust, reputation
• Operating a VO
  • monitoring QoS
  • perturbation: coping with failures - and new opportunities!
  • policing: what went wrong? who’s to blame?
www.conoise.org
Grid Data Service
[Figure labels: perform document, response document, The Engine, Query Activity, Transform Activity, Delivery Activity, Role Mapper, Data Resource Implementation. Arrow labels: element, data, query, credentials, connection, role]
http://www.ogsadai.org.uk/
GDS - pipeline example
Sql Query Statement:
<!-- Runs the query and writes the rows to the stream named "MyOutput" -->
<sqlQueryStatement name="statement">
  <expression>select * from myTable where id=10</expression>
  <resultSetStream name="MyOutput"/>
</sqlQueryStatement>
Deliver ToURL:
<!-- Reads the stream "MyOutput" and delivers it to the given URL -->
<deliverToURL name="deliverOutput">
  <fromLocal from="MyOutput"/>
  <toURL>ftp://anon:frog@ftp.example.com/home</toURL>
</deliverToURL>
http://www.ogsadai.org.uk/
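The pipeline works because the delivery activity's `from` attribute names the stream that the query activity declares. A minimal Python sketch of checking that wiring follows. It is not part of OGSA-DAI: the `<pipeline>` wrapper element and the function `pipeline_links` are hypothetical, introduced only so the two fragments from the slide parse as one XML document.

```python
import xml.etree.ElementTree as ET

# The two activities from the slide, wrapped in a hypothetical <pipeline>
# root so the fragment parses as one XML document (an assumption; the
# slide does not show the enclosing perform-document element).
PIPELINE_XML = """
<pipeline>
  <sqlQueryStatement name="statement">
    <expression>select * from myTable where id=10</expression>
    <resultSetStream name="MyOutput"/>
  </sqlQueryStatement>
  <deliverToURL name="deliverOutput">
    <fromLocal from="MyOutput"/>
    <toURL>ftp://anon:frog@ftp.example.com/home</toURL>
  </deliverToURL>
</pipeline>
"""


def pipeline_links(xml_text):
    """Return (producer, stream, consumer) triples for matched streams.

    Assumes, as on the slide, that an output stream is declared by a child
    element's `name` attribute and consumed via a child's `from` attribute.
    Raises ValueError if an activity reads a stream nobody has produced yet.
    """
    root = ET.fromstring(xml_text)
    producers = {}  # stream name -> name of the activity that writes it
    links = []
    for activity in root:
        # First resolve the streams this activity consumes.
        for child in activity:
            if "from" in child.attrib:
                stream = child.attrib["from"]
                if stream not in producers:
                    raise ValueError(f"no activity produces stream {stream!r}")
                links.append((producers[stream], stream, activity.get("name")))
        # Then register the streams this activity produces.
        for child in activity:
            if "name" in child.attrib:
                producers[child.attrib["name"]] = activity.get("name")
    return links


links = pipeline_links(PIPELINE_XML)
print(links)
```

A check of this kind would catch a mismatched stream name before the document is submitted; the real service performs its own validation, which this sketch does not attempt to reproduce.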
Grid data access & integration
• Solutions in place to handle
  • heterogeneous data storage
  • pipelines / dataflows
  • access control
  • … within the Grid service architecture
• Not specific to e-Science!
  • e.g. see FirstDIG project
• Major issues remain, including
  • provenance - where did it come from, who did what to it?
  • data quality - living with variable-quality data (www.qurator.org)
http://www.ogsadai.org.uk/
Metadata in e-Science
• Experiment datasets
  • formally curated
  • raw/pre-processed
  • in vivo / in vitro / in silico
• Scientific method
  • experiment workflow
  • knowledge roles: hypotheses, observations, predictions, deductions, …
• Discourse & natural arguments: proof, refutation, agreement, …
• Publications
  • formal/reviewed
  • “grey”
  • associated artefacts
• People
  • expert directories
  • communities of practice
• Projects
  • formal/funded
  • working groups
Managing scientific metadata
[Figure: graph of Evidence, Experiment, Hypothesis and Publication nodes linked by the relations Described In, Agrees With and Disagrees With]
e-Science metadata management platform
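One way to picture metadata of this kind is as subject-relation-object triples. The Python sketch below uses the three relation names from the figure; the identifiers (`experiment1`, `hypothesis1`, `publication1`, …), the direction of each relation, and the function names are assumptions made for illustration, not details of the platform itself.

```python
from collections import defaultdict

# Relation names taken from the slide's figure labels.
DESCRIBED_IN = "describedIn"
AGREES_WITH = "agreesWith"
DISAGREES_WITH = "disagreesWith"

# Hypothetical identifiers, invented purely for illustration.
triples = [
    ("experiment1", DESCRIBED_IN, "publication1"),
    ("hypothesis1", DESCRIBED_IN, "publication1"),
    ("hypothesis2", DESCRIBED_IN, "publication2"),
    ("hypothesis3", DESCRIBED_IN, "publication3"),
    ("hypothesis2", AGREES_WITH, "hypothesis1"),
    ("hypothesis3", DISAGREES_WITH, "hypothesis1"),
]


def index_by_object(triples):
    """Map (relation, object) to the list of subjects holding that relation."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[(relation, obj)].append(subject)
    return index


def publications_disputing(hypothesis, triples):
    """Publications describing a hypothesis that disagrees with `hypothesis`."""
    rivals = index_by_object(triples)[(DISAGREES_WITH, hypothesis)]
    return sorted(
        obj
        for subject, relation, obj in triples
        if relation == DESCRIBED_IN and subject in rivals
    )


disputing = publications_disputing("hypothesis1", triples)
print(disputing)
```

Storing the links as plain triples keeps queries such as "which publications dispute this hypothesis?" to a simple lookup; a real platform would more likely use an ontology-backed store.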
Fearlus-G pilot project
[Figure labels: metadata schema (ontology), desktop client, metadata client, Globus client]
MSc e-Science Technologies: next…
• CS5547 e-Science & Grid Computing
  • Grid middleware, e-Science workflow, metadata
• CS5553 Intelligent Architectures
  • technologies for Virtual Organisations
• CS5545 Data Interpretation & Communication
  • technologies at the data/user-scientist interface
• CS5544 E-Technology Workshop
  • group project, with an e-Science application
• CS5945 MSc Project in E-Technology
  • potential to do a project with user-scientists