IMAS requirements: Frameworks and Workflows. B. Guillerminet, IM Design Team, CEA IRFM, 8 June 2011
Outline • Session on Frameworks & Workflows • Agenda & Talks • Frameworks • Introduction • Fusion specificities • IMAS Framework requirements • Workflows • WfMS requirements
Frameworks & Workflows session • Agenda & talks • 10h40-12h30 • Introduction: IMAS requirements towards Frameworks and Workflows, B. Guillerminet (20 + 20) • SWIM Framework, W. ElWasif (ORNL) (20 + 10) • SOAF Framework, N. Hayashi (JAEA) (20 + 10) • 13h40-15h40 • Climate modeling Framework, S. Denvil (CNRS) (20 + 10) • Kepler, I. Altintas (20 + 10) • Taverna, S. Soiland-Reyes (20 + 10) • Strategies for collaborative Design and Validation, J. Courquet (CS) (20 + 10) • 16h00-18h00 • Comparison of scientific workflow engines, reported by B. Guillerminet (CEA) (20+10) • EU ITM-TF experience with Kepler, G. Falchetto (CEA) (20+10) • Experience with Simulink, Scicos and Kepler, S. Mannori (ENEA) (20+10) • Discussion (30)
Frameworks • Introduction • Frameworks are commonly used for large applications • Rules/procedures are mandatory for complex applications • A framework is a set of {rules, procedures, tools} • “a software framework is an abstraction … providing generic functionality … can be specialized … providing specific functionality” (from Wikipedia) • A framework adds constraints but comes with a set of tools that simplify use, integration … • Important non-functional requirements: ease of use, learning curve, community support, long term … • A framework is accepted only if “advantages >> constraints”
Fusion specificities • Requirements and needs from Fusion Integrated Modelling: • Many scales (space & time): Fig. from 2EFR4K (ITER doc.) • Nonlinear: loops, solvers, convergence • Many interactions between various components => Challenging for code/data coupling, scheduling, parallel & distributed computing
IMAS framework • Requirements on • Data: next session (9th June) • Software management: Version handling, regression tests => standard • Addressing: • Transverse (non-functional) • Interfaces • Physics models • WfMS • An Integrated Modelling framework is a cultural shift (“Proto-FSP”): • Independent codes (specific models for a dedicated machine) => physics components within a framework using standard fusion data • It implies: • Component model • Orchestration tool • Fusion data model • Acceptance procedures • …
IMAS framework requirements • General/Transverse requirements: • Maintainability: • Modular architecture. Layer and component based architecture • Code coupling is based on the framework (calls go through the IMAS WfMS, data through the IMAS data service). • Tight coupling for performance enhancement: • Direct calls & data transfers (broadcast => invoke, remove put/get data) • Limitations: • nearby components (same computer) • same language (ex: F95 with gfortran version 4.1.02) • Tools?
IMAS framework requirements • General/Transverse requirements: • Extensibility: • Long term project => software evolution + new software • Small team for administration/support • User oriented: • Code/component could run within and outside the orchestration tool (“Proto-FSP”) • Code must remain independent of the framework: • Data are standardized (Fusion/ITER) • No call to the framework/WfMS inside the user’s code • Easy to use and to learn • Support legacy codes: C/C++, F90/95 … • Acceptance criteria and procedure
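The requirement that user code stays independent of the framework can be illustrated with a minimal sketch. It assumes a hypothetical physics routine `advance_temperature`, a hypothetical data class `CoreProfile` and a hypothetical adapter `TemperatureComponent`; none of these names come from IMAS. The physics routine takes and returns plain data and never calls the framework, so it runs unchanged outside the orchestration tool; only the thin adapter knows about workflow ports.

```python
from dataclasses import dataclass


@dataclass
class CoreProfile:
    """Stand-in for a standardized data structure (hypothetical, not the IMAS data model)."""
    time: float
    temperature: list


def advance_temperature(profile: CoreProfile, dt: float, decay_rate: float) -> CoreProfile:
    """User physics code: plain data in, plain data out, no framework calls."""
    factor = 1.0 - decay_rate * dt
    return CoreProfile(profile.time + dt, [t * factor for t in profile.temperature])


class TemperatureComponent:
    """Thin adapter owned by the framework side (hypothetical interface)."""

    def __init__(self, dt: float, decay_rate: float):
        self.dt = dt
        self.decay_rate = decay_rate

    def step(self, inputs: dict) -> dict:
        # The adapter translates between workflow ports and the user routine.
        new_profile = advance_temperature(inputs["profile"], self.dt, self.decay_rate)
        return {"profile": new_profile}


# Standalone use, outside any orchestration tool:
standalone = advance_temperature(CoreProfile(0.0, [100.0, 50.0]), dt=0.5, decay_rate=0.2)

# Same routine driven through the adapter, as a workflow engine might do:
component = TemperatureComponent(dt=0.5, decay_rate=0.2)
wrapped = component.step({"profile": CoreProfile(0.0, [100.0, 50.0])})["profile"]
```

Both call paths produce the same result, which is the point of the design: replacing or removing the WfMS only touches the adapter, not the physics code.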
IMAS framework requirements • General/Transverse requirements: • Scalability: • Support parallel & distributed computing • Integrate High Performance Computers • Scaling up workflow applications: 800000 distributed tasks? • Reliability: • Regression test suite: procedures, reference data … • Able to handle major hardware/software failures • V&V: a validation mechanism: • tools to determine the uncertainties and to compare with experimental data • Able to support long simulations: • Checkpointing • HPC issues • Traceability: • A provenance mechanism. Tracking and recording the simulations
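Checkpointing for long simulations can be sketched as follows. This is a toy illustration under stated assumptions: the state is a small JSON-serializable dictionary, and the function names (`save_checkpoint`, `load_checkpoint`, `run`) and the `fail_at` switch used to simulate a crash are hypothetical. A real HPC code would more likely checkpoint large arrays through a parallel I/O library.

```python
import json
import os
import tempfile


def save_checkpoint(path: str, state: dict) -> None:
    """Write the state atomically: write a temporary file, then rename it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)  # atomic when both paths are on the same filesystem


def load_checkpoint(path: str) -> dict:
    """Return the saved state, or a fresh initial state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {"step": 0, "value": 1.0}


def run(path: str, n_steps: int, checkpoint_every: int, fail_at=None) -> dict:
    """Advance a toy simulation, resuming from the last checkpoint if present."""
    state = load_checkpoint(path)
    while state["step"] < n_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated failure")
        state["value"] *= 0.5  # placeholder for one expensive time step
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

If `run` is interrupted, calling it again with the same path restarts from the last saved step rather than from step 0, so at most `checkpoint_every` steps of work are lost.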
IMAS framework requirements • Interface and software management requirements: • Graphical tools for: • simulation parameters settings (including waveforms) • workflow design, monitoring • An integration mechanism for the PCS simulator • A debugging facility • Standard tools for lifecycle management: • Documentation, training • regression tests, version handling • … • Support at least C/C++, F90/95, scripting (bash+python) … • IMAS WfMS requirements: • User code must be able to run outside the WfMS • User code must not be impacted by the component model of the WfMS
Workflows • Workflow (WfMS): • “A workflow management system is a computer system that manages and defines a series of tasks within an organization to produce a final outcome or outcomes.” (from Wikipedia) • Not yet standardized => research activity: business, scientific, control • Meaning of this simple workflow? (A, B, C, D), (A, B//C, D), (A//B//C//D) … • For simple, non-repetitive tasks => a human operator is adequate • For simple workflows or repetitive tasks => a scripting procedure is well suited • For complex workflows => a WfMS is mandatory • Requirements on the WfMS: • General to more technical requirements • Can one size fit all? • No advantage in working with several WfMS • Not tied to a single WfMS
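The three readings of the four-task workflow, (A, B, C, D), (A, B//C, D) and (A//B//C//D), can be made concrete with a small sketch. It assumes that "//" means "may run concurrently" and models a workflow as an ordered list of stages; the helper names `make_task` and `run_stages` are hypothetical and not tied to any WfMS.

```python
from concurrent.futures import ThreadPoolExecutor


def make_task(name, log):
    """Build a toy task that records its name when it runs."""
    def task():
        log.append(name)  # list.append is thread-safe in CPython
        return name
    return task


def run_stages(stages, log):
    """Run stages in order; tasks inside one stage run concurrently."""
    with ThreadPoolExecutor() as pool:
        for stage in stages:
            futures = [pool.submit(make_task(name, log)) for name in stage]
            for future in futures:
                future.result()  # barrier: wait for the whole stage
    return log


sequential = [["A"], ["B"], ["C"], ["D"]]      # (A, B, C, D)
fork_join = [["A"], ["B", "C"], ["D"]]         # (A, B//C, D)
fully_parallel = [["A", "B", "C", "D"]]        # (A//B//C//D)

order = run_stages(fork_join, [])
```

In the fork-join reading, A always runs first and D last, while the relative order of B and C is not defined. The same drawing therefore describes three different executions, which is why the semantics of the arrows have to be stated explicitly.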
WfMS requirements • General requirements on WfMS • Workflow development: • Graphical tool for workflow design & editing • Debugging tools • A mechanism to interactively pause/visualize/edit/resume a workflow • Ontology mechanism • Workflow parameterization: • Visualize/edit data (including waveforms) • Visualize/edit workflow parameters • Parameterize the link to the PCS simulator • Workflow execution: • Workflow monitoring and data visualization • Graphical tool or scripting language for workflow execution • Scheduling & managing components or workflows on distributed/HPC infrastructure
WfMS requirements • Scalability: Able to schedule/manage 800000 distributed tasks? • Design (from “Scaling up workflow-based applications” 2010): • complex workflows are not readable => modular workflows (requires nested workflows) + recursion • Most of the workflow is concerned with data transformations => Clear separation of data & tasks (data transformation is the responsibility of the data layer) • Components are coarse-grained codes • Execution: • Distributed execution • Fault recovery mechanism • Traceability: • Provenance for simulation tracking/recording • V & V: • Parameter studies (UQ): validity range, statistical analysis
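A provenance mechanism for tracking and recording simulations could look like the following minimal sketch. The record fields, the `fingerprint` and `run_with_provenance` helpers, and the toy `scale` component are all assumptions made for illustration; they do not describe an existing IMAS or WfMS interface.

```python
import hashlib
import json
import time


def fingerprint(data) -> str:
    """Stable SHA-256 digest of JSON-serializable data."""
    encoded = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def run_with_provenance(name, version, func, inputs, records):
    """Run one component and append a provenance record for the call."""
    started = time.time()
    outputs = func(**inputs)
    records.append({
        "component": name,
        "version": version,
        "inputs_sha256": fingerprint(inputs),
        "outputs_sha256": fingerprint(outputs),
        "started": started,
        "duration_s": time.time() - started,
    })
    return outputs


def scale(values, factor):
    """Toy component standing in for a physics code."""
    return {"values": [v * factor for v in values]}


records = []
out = run_with_provenance(
    "scale", "0.1", scale, {"values": [1.0, 2.0], "factor": 3.0}, records
)
```

Because the record is written by the wrapper and not by `scale` itself, the component stays free of tracking code, and two runs with identical inputs and code version can be recognized by comparing digests.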
WfMS requirements • Technical requirements on WfMS • Support different models of computation (MoC) or several connected WfMS (not explicitly mentioned in D1.2): • Control flow: “Automated data processing” use case; usual “Business” WfMS, DAG, arrow = execution order; examples: BPEL … • Data flow: “Plasma simulation”; scientific WfMS, loops, // execution, arrow = data; examples: Kepler, Triana … • Time flow: “Equation solver”; control WfMS, differential equations, arrow = time; examples: Simulink, Scicos … • State flow: #phases (init, time step …), fault recovery …; command & control, machine operation, arrow = event/transition • IMAS requires all these models of computation: • Either several connected WfMS • Or one WfMS supporting these MoCs
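The difference between the first two models of computation can be sketched in a few lines. In the control-flow graph the arrows only fix an execution order, while in the data-flow graph the arrows carry values and an actor fires once all its inputs are available. The graphs, the actor names and the `run_dataflow` helper are hypothetical; engines such as Kepler or BPEL implementations are considerably richer. `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Control flow: arrows mean "runs after"; only the execution order is defined.
control_graph = {"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}  # node -> predecessors
execution_order = list(TopologicalSorter(control_graph).static_order())

# Data flow: arrows carry data; each actor lists its input actors and a function.
actors = {
    "source": ((), lambda: 2.0),
    "double": (("source",), lambda x: 2.0 * x),
    "square": (("source",), lambda x: x * x),
    "total": (("double", "square"), lambda a, b: a + b),
}


def run_dataflow(actors):
    """Fire every actor whose inputs are ready until all have produced a value."""
    values = {}
    while len(values) < len(actors):
        fired = False
        for name, (inputs, func) in actors.items():
            if name not in values and all(i in values for i in inputs):
                values[name] = func(*(values[i] for i in inputs))
                fired = True
        if not fired:
            raise ValueError("deadlock: some actor can never receive its inputs")
    return values


results = run_dataflow(actors)
```

The control-flow version says nothing about what passes between A and D, whereas the data-flow version has no explicit ordering at all: the order emerges from data availability. This is one way to see why a single engine does not automatically cover every use case listed above.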