CIPRes in Kepler: An integrative workflow package for streamlining phylogenetic data analyses

Presentation Transcript


  1. CIPRes in Kepler: An integrative workflow package for streamlining phylogenetic data analyses
     Zhijie Guan (1), Alex Borchers (1), Timothy McPhillips (2), Shirley Cohen (3), Mark A. Miller (1), Ilkay Altintas (1)
     (1) San Diego Supercomputer Center, UCSD
     (2) University of California, Davis
     (3) University of Pennsylvania

  2. What is a Scientific Workflow?
     • Combination of data integration, analysis, and visualization steps into a larger, automated "scientific process"
     • Mission of scientific workflow systems
       • Promote “scientific discovery” by providing tools and methods to generate scientific workflows
       • Create an extensible and customizable graphical user interface for scientists from different scientific domains
       • Support computational experiment creation, execution, sharing, reuse and provenance
       • Design frameworks which define efficient ways to connect to existing data and integrate heterogeneous data from multiple resources
       • Make technology useful through the user’s monitor!

  3. Promoter Identification Workflow Source: Matt Coleman (LLNL)

  4. A Workflow for Phylogeny Analysis

  5. Kepler is a Scientific Workflow System (www.kepler-project.org)
     • … and a cross-project collaboration
     • June 2, 2006 Beta release
     • Builds upon the open-source Ptolemy II framework
     Ptolemy II: A software system used for prototyping engineering systems
     KEPLER: A platform to design and execute scientific workflows
     KEPLER = “Ptolemy II + X” for scientific workflows

  6. Some Kepler Contributors
     Projects named on the slide: Ptolemy II, Griddles, SKIDL, Resurgence, SRB, NLADR, LOOKING
     Contributor names and funding info are at the Kepler website.
     Other contributors:
     - Chesire (UK Text Mining Center)
     - DART (Great Barrier Reef, Australia)
     - National Digital Archives + UCSD-TV (US)
     - …

  7. A co-development in KEPLER: GEON Dataset Generation & Registration
     Slide labels: "% Makefile", "$> ant run", "SQL database access (JDBC)"

  8. Phylogeny Analysis Workflows
     Diagram labels: Multiple Sequence Alignment, Phylogeny Analysis, Tree Visualization, Local Disk

  9. Kepler Workflow: Actors
     • Actor
       • Encapsulation of parameterized actions
       • Interface defined by ports and parameters
     • Port
       • Communication between input and output data
       • The place where data get in/out
     • Model of computation
       • Flow of control
       • Sequential / parallel execution
       • Implementation is a framework
     Actor-Oriented Design
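
The actor, port, and model-of-computation terms above can be illustrated with a minimal Python sketch. This is not Kepler or Ptolemy II code: the classes `Port`, `Actor`, `Scale`, and `Collect`, and the function `run_sequential`, are hypothetical names invented for illustration, and the "model of computation" here is just a loop that fires actors in order.

```python
from collections import deque


class Port:
    """A named queue that carries tokens into or out of an actor."""

    def __init__(self, name):
        self.name = name
        self.tokens = deque()
        self.receivers = []  # input ports fed by this (output) port

    def connect(self, other):
        # Wiring an output port to an input port forms a channel.
        self.receivers.append(other)

    def send(self, token):
        # Deliver the token to every connected input port.
        for port in self.receivers:
            port.tokens.append(token)

    def get(self):
        return self.tokens.popleft()


class Actor:
    """Encapsulates a parameterized action behind input/output ports."""

    def __init__(self, name, **parameters):
        self.name = name
        self.parameters = parameters
        self.input = Port(f"{name}.input")
        self.output = Port(f"{name}.output")

    def fire(self):
        raise NotImplementedError


class Scale(Actor):
    """Hypothetical actor: multiplies each token by its 'factor' parameter."""

    def fire(self):
        self.output.send(self.input.get() * self.parameters["factor"])


class Collect(Actor):
    """Hypothetical sink actor: stores every token it receives."""

    def __init__(self, name):
        super().__init__(name)
        self.results = []

    def fire(self):
        self.results.append(self.input.get())


def run_sequential(actors, source_tokens):
    """A trivial sequential schedule: fire each actor once per source token."""
    for token in source_tokens:
        actors[0].input.tokens.append(token)
        for actor in actors:
            actor.fire()


scale = Scale("scale", factor=3)
sink = Collect("sink")
scale.output.connect(sink.input)
run_sequential([scale, sink], [1, 2, 3])
print(sink.results)  # [3, 6, 9]
```

The actors know nothing about scheduling; swapping `run_sequential` for a different scheduler would change the flow of control without touching the actor classes, which is the separation the slide attributes to the model of computation.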

  10. CIPRes Workflow: Actors
      Input Port: Nexus File Content
      Output Ports: Data Matrix, Tree, Taxa Info
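
As a rough illustration of an actor with one input port (Nexus file content) and three output ports (data matrix, tree, taxa info), the sketch below splits a NEXUS string into its blocks. It is not the CIPRes actor's implementation: `split_nexus`, `BLOCK_RE`, and `SAMPLE_NEXUS` are hypothetical names, and the regular expression is a simplification that ignores NEXUS comments, quoted tokens, and the `ENDBLOCK` keyword.

```python
import re

# A tiny hand-written NEXUS document used only for this example.
SAMPLE_NEXUS = """#NEXUS
BEGIN TAXA;
  DIMENSIONS NTAX=3;
  TAXLABELS A B C;
END;
BEGIN CHARACTERS;
  DIMENSIONS NCHAR=4;
  FORMAT DATATYPE=DNA;
  MATRIX
    A ACGT
    B ACGA
    C TCGA
  ;
END;
BEGIN TREES;
  TREE t1 = ((A,B),C);
END;
"""

# Matches "BEGIN <name>; ... END;" and captures the block name and body.
BLOCK_RE = re.compile(r"BEGIN\s+(\w+)\s*;(.*?)\bEND\s*;", re.IGNORECASE | re.DOTALL)


def split_nexus(text):
    """Return the three 'output port' payloads for one NEXUS input string."""
    if not text.lstrip().upper().startswith("#NEXUS"):
        raise ValueError("input does not start with #NEXUS")
    blocks = {name.upper(): body.strip() for name, body in BLOCK_RE.findall(text)}
    return {
        "taxa_info": blocks.get("TAXA"),
        # Sequence data may live in either a CHARACTERS or a DATA block.
        "data_matrix": blocks.get("CHARACTERS") or blocks.get("DATA"),
        "tree": blocks.get("TREES"),
    }


ports = split_nexus(SAMPLE_NEXUS)
print(ports["tree"])
```

A missing block simply yields `None` on the corresponding output, so a downstream step can decide whether, for example, a file without a TREES block is an error.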

  11. Some actors in place for…
      • Generic Web Service Client and Web Service Harvester
      • Customizable RDBMS query and update
      • Command Line wrapper tools (local, ssh, scp, ftp, etc.)
      • Some Grid actors: Globus Job Runner, GridFTP-based file access, Proxy Certificate Generator
      • SRB support
      • Native R and Matlab support
      • Interaction with Nimrod and APST
      • Communication with ORBs through actors and services
      • Imaging, Gridding, Vis Support
      • Textual and Graphical Output
      • …more generic and domain-oriented actors…
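
The idea behind a local command-line wrapper can be sketched in a few lines of Python. This is an illustrative stand-in, not Kepler's wrapper actor: `run_command` is a hypothetical helper, and the wrapped "tool" is just the Python interpreter uppercasing its standard input so that the example runs anywhere.

```python
import subprocess
import sys


def run_command(argv, stdin_text=""):
    """Run an external program, feed it text on stdin, and return its stdout.

    check=True raises CalledProcessError if the program exits with a
    non-zero status, so a failing tool stops the workflow instead of
    passing empty output downstream.
    """
    completed = subprocess.run(
        argv,
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


# Stand-in for a real analysis tool: uppercase whatever arrives on stdin.
out = run_command(
    [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"],
    "acgt",
)
print(out)
```

Treating the program as a function from input text to output text is what lets a wrapped tool sit between other workflow steps without those steps knowing how it is invoked.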

  12. CIPRes Workflow
      Diagram annotations:
      • Actor: GUIGen: Parameter Setting
      • Choose the input file
      • Run ClustalW
      • Channel: Convey the data
      • Get the subset of the aligned sequences
      • Run PAUP for Tree Inference
      • Read the tree
      • Parse the tree
      • Results: Display the tree

  13. CIPRes Workflows: Demo
      • Read Sequences → Multiple Sequence Alignment → Display the Alignment
      • Matrix Alignment → Tree Inference → Consensus Tree → Tree Visualization
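
The first demo chain (Read Sequences, Multiple Sequence Alignment, Display the Alignment) can be mimicked as a composition of functions. The sketch below is only a toy under stated assumptions: `compose`, `read_sequences`, `align`, and `display_alignment` are hypothetical names, and `align` merely pads sequences with gap characters as a placeholder for a real aligner such as ClustalW, which it does not call.

```python
from functools import reduce


def compose(*steps):
    """Chain steps left to right: the output of one is the input of the next."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def read_sequences(fasta_text):
    """Parse FASTA text into a {name: sequence} dictionary."""
    sequences = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            sequences[name] = ""
        elif name is not None:
            sequences[name] += line
    return sequences


def align(sequences):
    """Placeholder for a real aligner: right-pad with gaps to equal length."""
    width = max(len(seq) for seq in sequences.values())
    return {name: seq.ljust(width, "-") for name, seq in sequences.items()}


def display_alignment(alignment):
    """Render one row per sequence, with names in a fixed-width column."""
    return "\n".join(f"{name:<6}{row}" for name, row in alignment.items())


workflow = compose(read_sequences, align, display_alignment)

FASTA = ">seq1\nACGT\n>seq2\nAC\n"
rendered = workflow(FASTA)
print(rendered)
```

Replacing `align` with a wrapper around an actual alignment program would leave the other two steps unchanged, which mirrors how the demo workflows are assembled from interchangeable steps.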

  14. Summary
      • Kepler is good at:
        • Integrating data, programs, and computing resources
        • Capturing your ideas and realizing them
        • Supporting computational experiment creation, execution, sharing, and reuse
        • Quickly prototyping scientific workflows
        • Building streamlining applications
      • Visual programming language
        • Don’t write your application, “draw”/compose it
      • Cipres-Kepler package can be used to build scientific workflows for phylogenetic data analyses

  15. Future Work
      • Cipres-Kepler can help you
      • There is (always) a lot more to work on:
        • More actors for phylogeny analyses
        • Automatically generating actors based on CORBA services
        • Database (TreeBase) support to store large amounts of data
        • More computing power for large dataset processing
      • Need your collaboration:
        • Sharing experiences
        • Teaching each other the domain knowledge
        • Locating a specific problem and solving it

  16. Questions?
      Zhijie Guan, guan@sdsc.edu, 1-858-822-3620, www.sdsc.edu
      Cipres-Kepler Release: ftp://ftp.sdsc.edu/outgoing/borchers/cipresReleases/20060621/cipresKepler_Dist.tgz
