HP-SEE project and the HPC Bioinformatics Life Science gateway

M. KOZLOVSZKY

Obuda University

The HP-SEE initiative is co-funded by the European Commission under the FP7 Research Infrastructures contract no. 261499


Overview

The HP-SEE project

HP-SEE Life Sciences Virtual Community

HP-SEE Bioinformatics Life Science gateway

Sequence alignment applicationsworkflow based online bioinformatics services

Working with workflows/gUSE

Summer School on Workflows and Gateways for Grids and Clouds 2012, Budapest, Hungary, 2-6.07.2012


Pan-European e-Infrastructures vision

e-Science Collaborations

  • The Research Network infrastructure provides fast interconnection and advanced services among Research and Education institutes of different countries

  • The Research Distributed Computing Infrastructure (Grid, HPC) provides a distributed environment for sharing computing power, storage, instruments and databases through the appropriate software (middleware) in order to solve complex application problems

  • This integrated environment is called an electronic infrastructure (eInfrastructure). It allows new methods of global collaborative research, often referred to as electronic science (eScience)

  • Creating the eInfrastructure is one of the key objectives in building the European Research Area

DCI Infrastructure

Network Infrastructure


Context: the Model - Converged Communication & Service Infrastructure for South-East Europe

[Figure: layered model of the infrastructure]

  • User / Knowledge layer: Seismology, Meteorology, Environment, Computational physics, Computational chemistry, Life sciences

  • HP-SEE

  • SEE-GRID & EGI

  • SEE-LIGHT & GEANT


Context: Timeline and funding


HP-SEE: Project

  • Contract: RI-261499

  • Project type: CP & CSA

  • Call: INFRA-2010-1.2.3: VRCs

  • Start date: 01/09/2010

  • Duration: 24 + 9 months

  • Total budget: 3 885 196 €

  • Funding from the EC: 2 100 000 €

  • Total funded effort, PMs: 539.5

  • Web site: www.hp-see.eu


HP-SEE: Partnership

Contractors (14)

Third Party / JRU mechanism used

associate universities / research centres


HP-SEE: Project Objectives

  • Objective 1 – Empowering multi-disciplinary virtual research communities

  • Objective 2 – Deploying integrated infrastructure for virtual research communities

    • Including a GEANT link to Southern Caucasus

  • Objective 3 – Policy development and stimulating regional inclusion in pan-European HPC trends

  • Objective 4 – Strengthening the regional and national human network


The HP-SEE Life Science VRC and its objectives

Main goal:

Match the combined HPC resources to the regional needs of the life/bioscience communities, foster research in the field within the region with the help of the large-scale, high-availability infrastructure, and facilitate cooperation between the sparsely distributed life science research centres.

Data and limitations

The Life Sciences domain has been revolutionized by advances in both computer hardware and software algorithms.

Assembling the Human Genome

Gene-expression chips to understand cellular processes

Exponential growth in the amount of publicly available genomic data.

GenBank

Traditional database approaches are no longer sufficient for rapidly performing life science queries involving the fusion of data types.

Existing computational tools were created by experimentalists dealing with data sets that were minuscule in comparison to those available today. As a result, software that was once perfectly adequate now performs slowly or is incapable of successful analysis on traditional computational platforms.


Accessible infrastructure

  • HP-SEE Supercomputing infrastructure

  • SEE-GRID-SCI Grid infrastructure


HP-SEE’s LS Applications

7 applications from 5 countries

  • Greece:

    • Searching for novel miRNA genes and their targets (miRs)

    • Network models of short and long term memory (CMSLTM)

  • Montenegro:

    • DNA Multi-core Analysis (DNAMA)

  • Hungary:

    • Deep sequencing for short fragment alignment (DeepAligner) - gUSE & workflow based

    • In-silico Disease Gene Mapper (DiseaseGene) - gUSE & workflow based

  • Georgia:

    • Modeling of some biochemical processes with the purpose of realization of their thin and purposeful synthesis (MSBP)

  • Armenia:

    • Molecular Dynamics Study of Complex systems (MDSCS)


Why gUSE/WS-PGRADE

  • Infrastructure

    • HP-SEE infrastructure

      • Based on gLite and ARC as middleware

      • Authentication procedures are painful (as usual)

    • Interoperability with grids is a plus

  • Application

    • Workflow-like process with embedded (legacy) applications

    • Restricted input parameter sets for the algorithms

    • Service-like operation

    • Portal features for a community

  • Knowledge, licensing & support

    • Open source software environment needed

    • Knowledge transfer required for the application specific modules
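The application requirements listed above (a workflow-like process and restricted input parameter sets) can be sketched in a few lines. This is only an illustration and does not use the gUSE/WS-PGRADE API: the step names, the `ALLOWED` table and its values are hypothetical, chosen to show how a portal might reject out-of-range parameters before a workflow is submitted and how steps could be ordered by their dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: step name -> set of steps it depends on.
WORKFLOW = {
    "split_reads": set(),
    "align_chunk": {"split_reads"},
    "merge_hits": {"align_chunk"},
}

# Hypothetical restricted parameter set exposed to portal users.
ALLOWED = {
    "aligner": {"blast", "bwa"},
    "max_mismatches": {0, 1, 2},
}


def validate(params):
    """Return a list of error messages; an empty list means the request is acceptable."""
    errors = []
    for name, value in params.items():
        if name not in ALLOWED:
            errors.append(f"unknown parameter: {name}")
        elif value not in ALLOWED[name]:
            errors.append(f"invalid value for {name}: {value!r}")
    return errors


def execution_order(workflow):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(workflow).static_order())
```

Restricting inputs to a fixed table keeps the embedded legacy applications from being started with arguments they were never tested with, which fits the service-like operation named in the list.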


HP-SEE Bioinformatics eScience Gateway

  • HP-SEE Bioinformatics eScience Gateway hosted at Obuda University, operated by MTA SZTAKI.

  • gUSE+WS-PGRADE (v3.3.2) - Liferay based

  • SEE region’s supercomputing & grid infrastructure used

  • Accessible at: http://ls-hpsee.nik.uni-obuda.hu:8080/liferay-portal-6.0.5


Architecture and application porting steps

Unified porting steps of the applications:


DeepAligner: Deep sequencing for short fragment alignment

  • Description & Objectives

    Mapping short fragment reads to open-access eukaryotic genomes can be done with a group of algorithms (BLAST, BWA, PatternHunter, and other sequence alignment tools). BLAST (including its mpiBLAST and ScalaBLAST variants) is one of the most frequently used tools in bioinformatics; the others are relatively new, fast, lightweight tools that align short sequences. Local installations of these algorithms typically cannot handle problems of this size, so the procedure runs slowly, while web-based implementations cannot accept a high number of queries. The HP-SEE infrastructure gives access to massively parallel architectures, and the sequence alignment code is distributed free for academia.

  • Result

    Online workflow based short sequence alignment service

  • Impact

    Freely available service/code for large scale short sequence alignment

  • Collaborations

    Hungarian Bioinformatics Association, Semmelweis University

  • HP-SEE infrastructure used: Hungarian HPC, NIIF’s supercomputing sites
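The description above says that single local installations are too slow for this problem size and that the HP-SEE infrastructure provides massively parallel architectures. One common way to exploit such hardware is to split the read set into chunks that are aligned independently. The slides do not say that DeepAligner works this way, so the sketch below is an assumption: `read_fasta` and `split_into_chunks` are hypothetical helper names, and the aligner itself is left out.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, parts = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                records.append((header, "".join(parts)))
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def split_into_chunks(records, n_chunks):
    """Distribute records round-robin over n_chunks lists, one list per worker."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    chunks = [[] for _ in range(n_chunks)]
    for i, record in enumerate(records):
        chunks[i % n_chunks].append(record)
    return chunks
```

Each chunk could then be handed to a separate job or MPI rank, with the per-chunk results merged afterwards. Round-robin assignment is used here only because it keeps chunk sizes within one record of each other.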


DeepAligner: Deep sequencing for short fragment alignment (contd.)

Small-scale launch (home cluster): PBS/Linux cluster at Obuda University, John von Neumann Faculty of Informatics.

Activity and technical assistance in pre-production stage: Technical assistance was provided by MTA SZTAKI and NIIF.

Porting: The application was ported using Perl and C. The workflow and GUI for the application were created by Obuda University.

Benchmarking

Scaled from 32 cores to 96 cores (MPI).
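The slides give the core counts of the scaling run (32 and 96) but no timings. As a sketch of how such a run is usually summarised, the helpers below compute relative speedup and parallel efficiency from wall-clock times that you measure yourself. The function names are hypothetical and no measured values from the project are implied.

```python
def relative_speedup(t_base, t_scaled):
    """Speedup of the scaled run relative to the baseline run (wall-clock times)."""
    if t_base <= 0 or t_scaled <= 0:
        raise ValueError("run times must be positive")
    return t_base / t_scaled


def parallel_efficiency(t_base, cores_base, t_scaled, cores_scaled):
    """Achieved speedup divided by the ideal speedup cores_scaled / cores_base."""
    if cores_base < 1 or cores_scaled < 1:
        raise ValueError("core counts must be at least 1")
    ideal = cores_scaled / cores_base
    return relative_speedup(t_base, t_scaled) / ideal
```

Going from 32 to 96 cores gives an ideal speedup of 3, so an efficiency of 1.0 would mean the 96-core run takes one third of the 32-core time.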

DeepAligner Status

The online service uses two sites of NIIF’s supercomputing infrastructure (the Budapest site and the Szeged site).

Foreseen activities: Optimization of parameter assignment in the GUI, and more scientific publications about short sequence alignment. Further scaling is planned, together with performance analysis.

More information: http://hpseewiki.ipb.ac.rs/index.php/DeepAligner


Development & working on gUSE/WS-PGRADE

  • Pros

    • Close collaboration and useful support

      • ARC middleware connector was developed from scratch by MTA SZTAKI on request

      • ASM and ARC submitter related bugs have been found and reported

      • Helpful and skilled support & development team

  • Cons

    • Internal ARC middleware problems are hard to find


Future plans

  • Additional plug-in-like online bioinformatics services

    • More sequence alignment workflows

    • More multiple sequence alignment workflows

    • Sequence database quality measurement workflows

  • Open up the gateway for users outside SEE region

    Thank you for your attention!

    Questions?


gUSE/WS-PGRADE architecture

DeepAligner

DiseaseGene

ASM (Application Specific Module)

WS-PGRADE
