A Task Pipelining Framework for e-Science Workflow Management Systems

Hyeong S. Kim (hskim@dcslab.snu.ac.kr)

In Soon Cho (ischo@dcslab.snu.ac.kr)

Heon Y. Yeom (yeom@snu.ac.kr)

Dept. of Computer Science and Engineering

Seoul National University

DCSLab, SNU

Outline
  • Introduction
  • Motivation
    • HVEM Grid
  • Proposing System
    • PIPE File System
  • Conclusion

Introduction
  • Complex Scientific Workflow
    • Input/output data are becoming larger and larger
    • In most scientific workflows, moving intermediate data accounts for a large portion of the running time and cannot be ignored
  • Our Focus
    • Staging is our primary concern.
    • We seek a way to pipeline multiple interconnected tasks
    • Applications can benefit if the output of a preceding task can be consumed by the subsequent task as soon as the data becomes ready
  • In this paper
    • We consider several components to enable task pipelining
    • As a reference implementation, we propose the PIPE File System (PFS), which supports various legacy applications without modifying them.
    • Our system can also be described in a workflow specification, so a user can set up task pipelining with no effort beyond providing a workflow specification for PFS.
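
The pipelining idea in this slide can be illustrated with a minimal Python sketch. It is not the PFS implementation: it uses an in-memory queue, and the names `producer`, `consumer`, and `run_pipelined` are hypothetical. It only shows a subsequent task consuming each piece of output as soon as the preceding task produces it, instead of waiting for the whole output to be staged.

```python
import queue
import threading


def producer(chunks, out_q):
    """Preceding task: emit output chunk by chunk, then signal completion."""
    for chunk in chunks:
        out_q.put(chunk)
    out_q.put(None)  # sentinel: no more data will arrive


def consumer(in_q, results):
    """Subsequent task: process each chunk as soon as it becomes available."""
    while True:
        chunk = in_q.get()
        if chunk is None:
            break
        results.append(chunk.upper())  # stand-in for real processing


def run_pipelined(chunks):
    """Run both tasks concurrently, connected by a small bounded buffer."""
    q = queue.Queue(maxsize=2)
    results = []
    t_cons = threading.Thread(target=consumer, args=(q, results))
    t_prod = threading.Thread(target=producer, args=(chunks, q))
    t_cons.start()
    t_prod.start()
    t_prod.join()
    t_cons.join()
    return results
```

The bounded queue is a design choice for the sketch: it lets the two tasks overlap while limiting how much intermediate data is buffered at once.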

Motivating Application – HVEM Grid
  • HVEM (High Voltage Electron Microscope) is financially supported by the Ministry of Science and Technology in Korea.
  • HVEM was installed in October 2003 at the headquarters of the Korea Basic Science Institute (KBSI), a national user facility.
  • The main purpose is to offer a leading-edge analytical technology to researchers in diverse scientific fields.

HVEM Grid System
  • The High Voltage Electron Microscope (HVEM) Grid system is a powerful tool built on the concepts of Grid and Web Services
    • To control instruments remotely
    • To manage and control 3-D processing of images
    • To store data automatically

Image Processing (G-Render)
  • Grid-based image processing system
  • 4-step image processing service:
    • 1) Image preprocessing
    • 2) Image alignment
    • 3) Tomogram generation
    • 4) Segmentation
  • Enabling high-performance image processing by using the Grid to obtain large amounts of computing power

Grid Workflow Management System

  • [Figure: architecture of a Grid workflow management system]
  • Build time
    • Grid users define workflows with Grid Workflow Application Modeling & Definition Tools (workflow design & definition)
    • The result is a Grid Workflow Specification
  • Run time
    • Grid Workflow Enactment Service: workflow execution & control, including workflow scheduling, data movement, and fault management
    • Grid Information Services: Resource Info Service and Application Info Service
    • Grid Middleware: interaction with Grid Resources

Design Consideration
  • Application-transparency
    • Supporting various legacy applications
  • Flexibility
    • Providing a general solution
  • Usability

Components Required
  • Workflow engine
    • If a sufficient amount of data is available, run the next task immediately
  • Storage manager
    • Manage storage
    • Logical to physical mapping
    • Directory management
    • Advertise data availability to workflow engine
  • Physical storage
    • Store input/output files
    • Handle read/write operations
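
How the workflow engine and storage manager might interact can be sketched as follows. This is a hypothetical illustration, not the authors' code: the class names, method names, and the byte threshold used to decide what counts as a "sufficient amount of data" are all assumptions. The storage manager keeps the logical-to-physical mapping and advertises data availability, and the engine starts the next task once enough data has been written.

```python
class WorkflowEngine:
    """Starts a waiting task once enough of its input is available."""

    def __init__(self):
        self.pending = {}  # logical file name -> (threshold in bytes, task name)
        self.started = []  # names of tasks that have been triggered

    def register(self, logical_name, threshold_bytes, task_name):
        self.pending[logical_name] = (threshold_bytes, task_name)

    def on_data_available(self, logical_name, bytes_available):
        entry = self.pending.get(logical_name)
        if entry is None:
            return  # nothing is waiting on this file (or it already started)
        threshold, task_name = entry
        if bytes_available >= threshold:
            self.started.append(task_name)  # stand-in for launching the task
            del self.pending[logical_name]


class StorageManager:
    """Maps logical files to physical locations and advertises availability."""

    def __init__(self, engine):
        self.engine = engine
        self.mapping = {}  # logical name -> (data server, physical path)
        self.sizes = {}    # logical name -> bytes written so far

    def create(self, logical_name, server, physical_path):
        self.mapping[logical_name] = (server, physical_path)
        self.sizes[logical_name] = 0

    def record_write(self, logical_name, nbytes):
        self.sizes[logical_name] += nbytes
        # Advertise the new amount of available data to the workflow engine.
        self.engine.on_data_available(logical_name, self.sizes[logical_name])
```

In this sketch, a task registered with a 1024-byte threshold is triggered by the second of two 512-byte writes and is not triggered again by later writes.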

Reference Implementation
  • PIPE File System (PFS) consists of
    • PFS Manager (storage manager)
    • PFS Data Servers (physical storage)
    • PFS Library (user transparency through FUSE)

PFS Manager
  • Resource Management
    • Storage management
    • PFS Data Server Maintenance
    • Logical to physical file mapping
    • Directory management
  • Single access point for the clients
    • A client can mount the PIPE File System and manipulate files as usual
  • Schedule Triggering
    • Advertise data availability to workflow engine

PFS Data Servers
  • Physical File Management
    • Store input/output files in its local storage
    • Serve read/write operations

PFS Library
  • User-level Library for Application
  • Two components
    • FUSE kernel module
      • Intercept I/O system calls
      • Redirect the system calls to the PFS Client
    • PFS Client
      • Interpret the I/O system calls
      • Redirect each command to the PFS Manager or a PFS Data Server
      • Maintain the open file list
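
The PFS Client's role can be sketched in Python as below. The slides do not say which operations go to which component, so the split used here (metadata operations to the PFS Manager, data operations to a PFS Data Server) is an assumption, and `PFSClientSketch`, `METADATA_OPS`, and `DATA_OPS` are hypothetical names. No real FUSE binding is used.

```python
# Assumed split of operations; the slides only say that commands are
# redirected to the PFS Manager or a PFS Data Server.
METADATA_OPS = {"open", "create", "mkdir", "readdir", "unlink", "close"}
DATA_OPS = {"read", "write"}


class PFSClientSketch:
    """Routes intercepted I/O calls and tracks open files."""

    def __init__(self):
        self.open_files = {}  # file descriptor -> logical path
        self._next_fd = 3     # 0-2 are conventionally stdin/stdout/stderr

    def route(self, op):
        """Return which component would handle the given operation."""
        if op in METADATA_OPS:
            return "manager"
        if op in DATA_OPS:
            return "data_server"
        raise ValueError(f"unsupported operation: {op}")

    def open(self, path):
        """Record a newly opened file and hand back its descriptor."""
        fd = self._next_fd
        self._next_fd += 1
        self.open_files[fd] = path
        return fd

    def close(self, fd):
        """Drop the file from the open file list."""
        self.open_files.pop(fd, None)
```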

Integration

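  • [Figure: integration diagram. Components shown: Workflow scheduler, PFS Manager, PFS Data Server, and two nodes labeled "p", each attached through FUSE. Links are labeled "enactment", "metadata", and "data".]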
Conclusion
  • We propose a task pipelining framework
  • Our system provides task pipelining in the form of a simple distributed file system
  • A triggering interface is used to enact the next task
