Emerging Standards for Interoperable Biological Systems

Technology for Life: North Carolina Symposium on Biotechnology and Bioinformatics

Dr. Marty McClelland

Standards: Why do we care?

  • IEEE standards for plugs, outlets and wiring – I can buy an appliance and use it (most of the time)

  • Any international traveler will tell you that standards vary around the world

Without Standards -

  • Custom builds by experts

  • Build once – use once

  • Need expertise in specific domain

  • Expensive

  • Most of us – still using candles

Standards and Software

  • World Wide Web

  • Plug and play

  • Plug-in / modular components

  • XML: Extensible Markup Language

  • Web Services

  • Federated Search

  • Grid Services

My Standards Journey

  • Middleware to integrate learning systems with enterprise resource planning systems

  • IMS / IEEE learning technology standards – learning object metadata

  • National Science Digital Library – STEM LOM repository

  • NCCU BBRI Cardiovascular Study – similar issues

Bioinformatics Community

  • Embraced open source

  • Philosophy of sharing of data and tools

  • Community involvement yields foundation for standards development

Emerging Standards

  • tools/middleware – web services for harvesting – federated searches

  • grid computing

  • ontologies – developing controlled vocabularies

  • analysis – standards for sharing results – e.g. microarray analysis

  • models – Systems Biology – standards for interchange

Sharing Data, Tools, and Middleware

  • XML: see http://www.w3.org/XML/

  • Specifications for data interchange in biology applications (XML schemas)

  • Web services

    • Define WSDL for biology applications
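A minimal sketch of XML-based data interchange, using only the Python standard library. The element names (`gene`, `symbol`, `organism`) and the sample record are hypothetical and are not taken from any community schema; a real exchange format would be defined by an agreed XML schema.

```python
import xml.etree.ElementTree as ET

def gene_to_xml(record):
    """Serialize a dict describing a gene into an XML string."""
    gene = ET.Element("gene", attrib={"id": record["id"]})
    for field in ("symbol", "organism"):
        child = ET.SubElement(gene, field)
        child.text = record[field]
    return ET.tostring(gene, encoding="unicode")

def xml_to_gene(xml_text):
    """Parse the XML produced by gene_to_xml back into a dict."""
    gene = ET.fromstring(xml_text)
    return {
        "id": gene.get("id"),
        "symbol": gene.findtext("symbol"),
        "organism": gene.findtext("organism"),
    }

# Hypothetical record used only for illustration.
record = {"id": "G0001", "symbol": "abc1", "organism": "Example organism"}
xml_text = gene_to_xml(record)
round_trip = xml_to_gene(xml_text)
```

Because both sides agree on the element names, the receiving tool can rebuild the record without knowing anything about the sender's internal data structures.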

XML for Data Exchange

  • CDISC, and

Virginia Bioinformatics Institute

  • toolbus

  • PathPort

  • Middleware for web services

  • query multiple databases

  • facilitate decision making and data interpretation

  • http://staff.vbi.vt.edu/pathport/services/

BioMOBY

  • simple extensible protocols

  • Web services for interoperable databases

  • http://biomoby.org/

Grid Computing

  • user authentication and authorization (like X.509 certificates)

  • Open Grid Computing Environment (OGCE) portal toolkit

  • Open Grid Services Architecture (OGSA)

  • Globus Toolkit

Grid Applications

  • iNquiry – commercial product

  • NC BioGrid prototype / planning stages

  • statewide Bioinformatics Portal being created by the University of North Carolina at Chapel Hill

  • GridNexus project

Ontologies

  • Controlled vocabulary

  • Crosswalks between controlled vocabularies

  • Interoperability

  • Browse and search services across disparate repositories

  • www.geneontology.org
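A crosswalk between two controlled vocabularies can be sketched as a simple mapping that rewrites a query before it is sent to a second repository. The two vocabularies and the `translate_query` helper below are hypothetical; real crosswalks are usually curated and may be one-to-many.

```python
# Hypothetical crosswalk from repository A's local terms to repository B's terms.
CROSSWALK = {
    "heart attack": "myocardial infarction",
    "high blood pressure": "hypertension",
}

def translate_query(terms, crosswalk):
    """Map each term to the target vocabulary; collect terms with no mapping."""
    mapped, unmapped = [], []
    for term in terms:
        key = term.strip().lower()
        if key in crosswalk:
            mapped.append(crosswalk[key])
        else:
            unmapped.append(term)
    return mapped, unmapped

mapped, unmapped = translate_query(["Heart Attack", "diabetes"], CROSSWALK)
```

Returning the unmapped terms separately makes gaps in the crosswalk visible instead of silently dropping part of the search.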

Data Analysis

  • MIAME: Minimum Information About a Microarray Experiment

  • http://mged.sourceforge.net/ontologies/index.php

Systems Biology

  • Historically – many custom, small-scale models with little reuse

  • The goal of Systems Biology is to construct the system from modular models, with data supplied via web service queries to databases

Model Integration

  • Systems Biology Workbench (SBW) strives to support model integration through:

  • Systems Biology Markup Language (SBML) – XML to represent biochemical networks – common framework to document models

  • SBW provides a framework for interoperation across heterogeneous modeling tools: http://sbml.org/index.psp
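To illustrate what "XML to represent biochemical networks" can look like in practice, here is a hedged sketch that reads a simplified, SBML-like document with the Python standard library. The model content is invented, the document is not validated against the SBML schema, and the namespace URI is assumed to be the SBML Level 2 Version 1 one; check the specification before relying on it.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for SBML Level 2 Version 1.
NS = {"sbml": "http://www.sbml.org/sbml/level2"}

# Simplified, SBML-like toy model used only for illustration.
SBML_TEXT = """
<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="toy_model">
    <listOfSpecies>
      <species id="S1" compartment="cell"/>
      <species id="S2" compartment="cell"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R1">
        <listOfReactants><speciesReference species="S1"/></listOfReactants>
        <listOfProducts><speciesReference species="S2"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

def summarize_model(sbml_text):
    """Return (model id, species ids, {reaction id: (reactants, products)})."""
    root = ET.fromstring(sbml_text)
    model = root.find("sbml:model", NS)
    species = [s.get("id") for s in model.findall("sbml:listOfSpecies/sbml:species", NS)]
    reactions = {}
    for rxn in model.findall("sbml:listOfReactions/sbml:reaction", NS):
        reactants = [r.get("species") for r in
                     rxn.findall("sbml:listOfReactants/sbml:speciesReference", NS)]
        products = [p.get("species") for p in
                    rxn.findall("sbml:listOfProducts/sbml:speciesReference", NS)]
        reactions[rxn.get("id")] = (reactants, products)
    return model.get("id"), species, reactions

model_id, species, reactions = summarize_model(SBML_TEXT)
```

Any tool that understands the shared element names can recover the same network, which is the point of a common interchange format.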

Implications

  • expose databases with web services

  • construct queries to locate the data

  • standards for grid services

  • community developed XML schemas for sharing biological data

UNCW Grid Initiative: GridNexus

  • The UNCW Grid Computing Project is a two-year collaborative project among a multi-discipline, multi-investigator core research team at UNCW and several discipline-focused researchers at partner institutions: NCSU, WCU, NCCU, ECU, and CFCC. The research areas and institutional interests of this project are:

  • Advanced Grid Software Development (UNCW)

  • Computational Chemistry (UNCW and ECU)

  • Bioinformatics (UNCW, NCSU, and NCCU)

  • Combinatorics (UNCW)

  • Business Computing (UNCW and NCCU)

  • Education and Training (UNCW, WCU, CFCC)

  • This project proposes to develop a Grid interface that is easy to use and may be used by a wide range of applications and users. We have developed an innovative graphical user interface (GUI) for grid applications. In particular, we introduced a new scripting language (JXPL) designed for web-based services and a GUI for creating scripts, and have demonstrated the use of these tools with grid services.

GridNexus

  • This initiative grew in part out of a need for HPC resources following the closure of the NCSC in June 2003, coupled with the availability of faculty with software programming expertise and others with computing applications that could benefit from use of a Grid.

  • The UNC-OP funded UNCW’s proposal for $557,634 over two years to develop Grid portals (GUI middleware to allow users to access software on computers on a Grid).

Resources of UNCW Grid

  • Beowulf cluster – 16 PIII processors in Computer Sciences Department

  • Fire and FireDev servers plus disk storage devices

  • PQS Quantum Cube – 8-CPU cluster with PQS and Gaussian 03 computational chemistry software, plus TCP-Linda environment.

  • An 8-processor IBM blade cluster with 0.5 TB of disk storage will be added soon.

  • Other computers may be added, including the possibility of using all computing lab computers, or possibly even all faculty/staff computers (when not in use).

GridNexus

  • The objective is to make accessing HPC resources (wherever they may be located) easy for scientists who are not computer savvy.

  • Most computation involves doing various mathematical operations on a dataset.

  • A GUI approach is employed, in which the user, after a single login that checks authentication and authorization, can create a ‘workflow’ of functions/operations graphically by connecting boxes dragged from a series of lists of options, then applying that series of steps to a dataset.

  • Such a ‘workflow’ can be saved for subsequent application to another dataset.
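The idea of a saved workflow that is reapplied to another dataset can be sketched as function composition. This is an illustration only, not how GridNexus stores workflows; `make_workflow`, `drop_missing`, and `normalize` are hypothetical names standing in for boxes connected in the GUI.

```python
from functools import reduce

def make_workflow(*steps):
    """Return a function that applies the given steps, in order, to a dataset."""
    def run(dataset):
        return reduce(lambda data, step: step(data), steps, dataset)
    return run

# Hypothetical processing steps standing in for boxes dragged into the workflow.
def drop_missing(values):
    return [v for v in values if v is not None]

def normalize(values):
    peak = max(values)
    return [v / peak for v in values]

# "Save" the workflow once, then apply it to more than one dataset.
workflow = make_workflow(drop_missing, normalize)
result_a = workflow([2.0, None, 4.0])
result_b = workflow([5.0, 10.0, None, 20.0])
```

The workflow object holds only the sequence of steps, so it can be reused on any dataset the steps accept.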

GridNexus

  • Job submission: Ideally in a grid, the grid middleware should select the ‘best’ resource – those computers that are available, capable, and have the software needed to handle the job.

  • The user need not select – nor know – where the computation is taking place. In fact, the job may even be passed from one computer to another for various aspects of the calculation.

  • The output is returned to the user’s workstation or account, rather than the user having to access and download the output file from a remote computer.
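A hedged sketch of the resource selection described above: filter for computers that are available, capable, and have the needed software, then pick one. The `Resource` and `Job` types, the sample machines, and the "most free CPUs" tie-break are assumptions for illustration; real grid middleware uses much richer scheduling policies.

```python
from dataclasses import dataclass, field

@dataclass
class Resource:
    name: str
    available: bool
    free_cpus: int
    software: set = field(default_factory=set)

@dataclass
class Job:
    cpus_needed: int
    software_needed: set

def select_resource(job, resources):
    """Pick an available resource that can run the job; prefer the most free CPUs."""
    candidates = [
        r for r in resources
        if r.available
        and r.free_cpus >= job.cpus_needed
        and job.software_needed <= r.software  # subset test: all software present
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.free_cpus)

# Hypothetical machines; the user never has to name one of them.
resources = [
    Resource("cluster-a", True, 16, {"blast"}),
    Resource("cluster-b", True, 8, {"gaussian", "blast"}),
    Resource("cluster-c", False, 32, {"gaussian"}),
]
chosen = select_resource(Job(4, {"gaussian"}), resources)
no_match = select_resource(Job(64, {"gaussian"}), resources)
```

Here `cluster-a` lacks the software and `cluster-c` is unavailable, so the job lands on `cluster-b` without the user choosing it.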

GridNexus

  • GridNexus is a GUI that allows the user to create/edit/run workflows

  • Based on Ptolemy II (http://ptolemy.eecs.berkeley.edu/ptolemyII). Ptolemy provides the GUI and workflow features. We have extended it to provide the functionality we want (JXPL and GridServices)

  • Release 1.0.0 is available for download at www.gridnexus.org

Getting Started

  • The right frame is the palette for building workflows

  • The upper left frame provides the library of modules

  • The lower left is a thumbnail of the entire workflow

The Basics

  • Sources produce data without needing input

  • Sinks consume data without producing output, but may have side effects (such as displaying results)

  • All workflows must start with sources and end with sinks
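One way to read the three rules above is as a check on the workflow graph. The sketch below is an interpretation, not GridNexus's own validation; `check_workflow` and its data layout are hypothetical.

```python
def check_workflow(kinds, links):
    """Return a list of problems; an empty list means the workflow is well formed.

    kinds maps module name -> "source", "transformer", or "sink".
    links is a list of (from_module, to_module) connections.
    """
    has_input = {dst for _, dst in links}
    has_output = {src for src, _ in links}
    problems = []
    for name, kind in kinds.items():
        if kind == "source" and name in has_input:
            problems.append(f"{name}: a source should not receive input")
        if kind == "sink" and name in has_output:
            problems.append(f"{name}: a sink should not feed other modules")
        if kind != "source" and name not in has_input:
            problems.append(f"{name}: missing input, so no source starts this path")
        if kind != "sink" and name not in has_output:
            problems.append(f"{name}: output unused, so no sink ends this path")
    return problems

good = check_workflow(
    {"Const": "source", "JxplDisplay": "sink"},
    [("Const", "JxplDisplay")],
)
bad = check_workflow(
    {"Const": "source", "Add": "transformer"},
    [("Const", "Add")],
)
```

The second workflow is flagged because its last module is not a sink, so nothing consumes the result.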

Simple Example 1

  • Click and drag the “Const” source to the workflow.

  • Click and drag the “JxplDisplay” sink to the workflow

Simple Example 1

  • Double click on the Const module

  • Change its value to 10

  • Click commit

  • The new value is shown on the icon

Simple Example 1

  • Input ports are on the left-hand side and output ports are on the right-hand side of each module

  • Click and drag from the output port of the Const module to the JxplDisplay

Simple Example 1

  • A link (or relation) is created between the two modules

  • The output of Const is consumed by the JxplDisplay

Simple Example 1

  • Click on the run button

  • The JxplDisplay evaluates the input and produces a display window to show the results.

  • Notice the output is in XML (actually JXPL)

Simple Example 2

  • Transformers are modules that take input, transform it, and produce new output

  • This example computes the expression (23 + 6) * -2

Simple Example 2

  • The Multiplication module takes the result of the addition (its first input) and multiplies that by -2 (its second input)

  • The result is consumed by JxplDisplay

What's Going On?

  • The workflow is not actually performing the operations. Instead it is creating a script (JXPL) that, when executed, produces the result

  • The JxplDisplay is evaluating the script and displaying the results

What's Going On?

  • Double-click on the JxplDisplay and deselect the “Evaluate Jxpl” parameter

  • This parameter tells JxplDisplay whether or not to evaluate the script that is generated

What's Going On?

  • Now when we run it, we see the actual script that is produced by the workflow

  • The script is written in XML using a language developed at UNCW called JXPL
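The distinction between building a script and evaluating it can be sketched in a few lines. In this illustration each "module" returns a nested list rather than a number, and a display function either evaluates the script or shows it, loosely mirroring the "Evaluate Jxpl" parameter. The nested-list form and all function names are stand-ins; this is not JXPL syntax.

```python
import operator

# Each "module" returns a piece of script (a nested list) instead of a number.
def const(value):
    return value

def add(left, right):
    return ["+", left, right]

def multiply(left, right):
    return ["*", left, right]

OPERATIONS = {"+": operator.add, "*": operator.mul}

def evaluate(script):
    """Recursively evaluate a nested-list script built by the modules above."""
    if not isinstance(script, list):
        return script  # an atom evaluates to itself
    op, *args = script
    return OPERATIONS[op](*(evaluate(arg) for arg in args))

def display(script, evaluate_script=True):
    """Stand-in for a display sink: return either the result or the script itself."""
    return evaluate(script) if evaluate_script else script

# Wiring for (23 + 6) * -2; nothing is computed until display evaluates it.
script = multiply(add(const(23), const(6)), const(-2))
shown_result = display(script)
shown_script = display(script, evaluate_script=False)
```

Separating construction from evaluation is what lets the same script be shipped elsewhere and executed later.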

A Little Bit about JXPL

  • JXPL is based on LISP

  • The LISP corresponding to the JXPL on the right looks like:

    (* (+ 23 6) -2)

A Little Bit about JXPL

  • Why?

    • XML is used to transport data between web/grid services

    • XML opening/closing tags <-> LISP opening/closing parens

    • Everything is either an atom or a list (functions, Data Structures)
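The correspondence between XML tags and LISP parentheses can be made concrete by rendering one nested structure both ways. The `list` and `atom` element names below are hypothetical placeholders chosen for this sketch; they are not the actual JXPL vocabulary.

```python
import xml.etree.ElementTree as ET

def to_lisp(expr):
    """Render a nested Python list as a parenthesized LISP-style string."""
    if isinstance(expr, list):
        return "(" + " ".join(to_lisp(item) for item in expr) + ")"
    return str(expr)

def to_xml(expr):
    """Render the same structure as XML: one <list> per paren pair, one <atom> per leaf."""
    if isinstance(expr, list):
        node = ET.Element("list")
        for item in expr:
            node.append(to_xml(item))
        return node
    node = ET.Element("atom")
    node.text = str(expr)
    return node

expr = ["*", ["+", 23, 6], -2]
lisp_text = to_lisp(expr)
xml_text = ET.tostring(to_xml(expr), encoding="unicode")
```

Every opening parenthesis becomes an opening tag and every closing parenthesis a closing tag, so the two forms carry the same tree.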

GridNexus and JXPL with Grid Services

  • create workflows that can make use of web and grid services

  • implement primitives in JXPL that are generic web and grid clients

  • inspect the WSDL of the service to determine its interface
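Inspecting a WSDL to determine a service's interface can be sketched with the Python standard library. The WSDL fragment, service name, and `list_operations` helper below are hypothetical; only the WSDL 1.1 namespace URI is standard. A generic client would go on to read types and bindings as well.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Minimal, hypothetical WSDL 1.1 fragment; a real service description also
# declares types, bindings, and a service endpoint.
WSDL_TEXT = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:tns="urn:example:sequence"
             name="SequenceService"
             targetNamespace="urn:example:sequence">
  <message name="GetSequenceRequest"/>
  <message name="GetSequenceResponse"/>
  <portType name="SequencePortType">
    <operation name="getSequence">
      <input message="tns:GetSequenceRequest"/>
      <output message="tns:GetSequenceResponse"/>
    </operation>
  </portType>
</definitions>
"""

def list_operations(wsdl_text):
    """Return {portType name: [(operation, input message, output message), ...]}."""
    root = ET.fromstring(wsdl_text)
    interface = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        ops = []
        for op in port_type.findall("wsdl:operation", WSDL_NS):
            inp = op.find("wsdl:input", WSDL_NS)
            out = op.find("wsdl:output", WSDL_NS)
            ops.append((
                op.get("name"),
                inp.get("message") if inp is not None else None,
                out.get("message") if out is not None else None,
            ))
        interface[port_type.get("name")] = ops
    return interface

interface = list_operations(WSDL_TEXT)
```

With the operation names and message types in hand, a generic client can be written once and pointed at any service that publishes a WSDL.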

GSClient module

  • The GSClient module lets the user specify the factory URL, the instance name of the service, the stub class, and the port type

  • The primitive uses the OGSIServiceGridLocator to find the grid service and invoke the appropriate method with the arguments

GridNexus and OGSA-DAI

  • OGSA-DAI Grid Data Services are designed so that the output of one can be delivered to another

  • GridNexus allows non-programmers to create JXPL to control GDS interaction in a graphical environment

Build the Library

  • Identify tasks in scientific workflows

  • Investigate existing open source modules for possible integration with GridNexus

  • Design for reuse incorporating appropriate standards

  • Implement library module in GridNexus

GridNexus

  • Release 1.0.0 is available for download at www.gridnexus.org

Acknowledgments

  • UNC-OP for funding the UNCW Grid Initiative Proposal:

    “Fostering Undergraduate Research Partnerships through a Graphical User Environment for the North Carolina Computing Grid,” Dr. Ron Vetter, PI

    • Co-PIs: Dr. Rebecca S. Boston, NCSU; Dr. Anthony Wilkinson, WCU; Dr. Marilyn McClelland, NCCU; Dr. Libero Bartolotti, ECU; Ms. Judy Porter, CFCC.

    • UNCW Participants: Computer Science: Dr. Ron Vetter, Dr. Clayton Ferner, Dr. David Berman, and Dr. Tom Hudson. Information Technology Systems: Dr. Bob Tyndall and Mr. Bobby Miller. Mathematics and Statistics: Dr. Jeff Brown. Chemistry and Biochemistry: Dr. Ned H. Martin. Biological Sciences: Dr. Ann Stapleton. Information Systems and Operations Management: Dr. Tom Janicki.

    • UNCW Computer Science students working on the Chemistry portal: Tristan Carland, Jerry Martin, Andrew Martin

Acknowledgments

  • Grid Computing: Harnessing Underutilized Resources. Dr. Ned H. Martin

  • GridNexus UNCW GUI for Workflow Management. Dr. Clayton Ferner

  • GridNexus: A Grid Services Scientific Workflow System. Jeffrey L. Brown, Clayton S. Ferner, Thomas C. Hudson, Ann E. Stapleton, Ronald J. Vetter, Andrew Martin, Jerry Martin, Allen Rawls, William J. Shipman, and Michael Wood
