Genomic Data Analysis Services Available for PL-Grid Users. Tomasz Waller, Tomasz Gubała, Kazimierz Murzyn. Academic Computer Centre Cyfronet AGH, Klaster LifeScience Kraków. Recent Advances in Omics Research, Kraków, October 2014.


Genomic Data Analysis Services Available for PL-Grid Users

Tomasz Waller, Tomasz Gubała, Kazimierz Murzyn

Academic Computer Centre Cyfronet AGH, Klaster LifeScience Kraków

Recent Advances in Omics Research, Kraków, October 2014

ACC Cyfronet AGH and PL-Grid Infrastructure

Academic Computer Centre Cyfronet AGH

  • Established in 1973 (40 years of experience)

  • Provides network, computational power and data storage capabilities for Polish science

    • ~374 TFlops (zeus, 175@top500), 2.5 PB (disks) and 3.5 PB (tapes)

    • 1.7 PFlops (prometheus) with 10 PB of disks, expected first half of 2015

    • Regular and bigmem nodes, vSMP, GPGPU, FPGA, MPI over Infiniband

    • Details:

PL-Grid Infrastructure for Polish science

  • Five computing centers with Cyfronet as the consortium leader

  • Total: ~588 TFlops and ~5.6 PB (disks) but soon to grow considerably (see above)

  • Available free of charge to all Polish scientists and their foreign collaborators

  • Details:

Using PL-Grid Infrastructure

  • Register at

    • User verification process based on Polish OPI number

    • Assistants and foreigners are confirmed by Polish PIs

    • Variety of basic and higher level services available after login

  • Local SSH access, cloud computing, middleware

  • Considerable library of installed applications

    • GATK, MACS, SAMTools, Picard, TopHat, Bowtie, (p)BWA, R/Bioconductor, AutoDock/AutoGrid, BLAST, Clustal, CPMD, Gromacs, NAMD, Matlab, Mathematica …

    • Free to compile and install own applications using the shell login

    • Possibility to use own commercial licenses on HPC resources

  • Specific services dedicated to the Life Science domain
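
As an illustration of how the installed command-line tools might be combined after shell login, the sketch below generates a PBS batch script that aligns paired-end reads with BWA and sorts the result with SAMtools. It is only a sketch under stated assumptions: the function name `build_pbs_script`, the file names, and the resource requests are hypothetical, and the way tools are made available on a given cluster (for example through environment modules) is not covered in the slides, so it is left out.

```python
from textwrap import dedent


def build_pbs_script(job_name, reference, reads_1, reads_2, output_bam,
                     cores=4, walltime="02:00:00"):
    """Return the text of a PBS batch script that aligns paired-end reads.

    All arguments are illustrative; the resource syntax (nodes/ppn) follows
    common PBS/Torque usage and may differ on a particular cluster.
    """
    return dedent(f"""\
        #!/bin/bash
        #PBS -N {job_name}
        #PBS -l nodes=1:ppn={cores}
        #PBS -l walltime={walltime}

        # Run from the directory where the job was submitted.
        cd "$PBS_O_WORKDIR"

        # Align with BWA-MEM, then sort and index the alignments with SAMtools.
        bwa mem -t {cores} {reference} {reads_1} {reads_2} \\
            | samtools sort -o {output_bam} -
        samtools index {output_bam}
        """)


if __name__ == "__main__":
    # Hypothetical file names, printed rather than submitted.
    print(build_pbs_script("align_demo", "ref.fa", "sample_R1.fastq",
                           "sample_R2.fastq", "sample.sorted.bam"))
```

The generated text could be saved to a file and submitted with `qsub`; the reference is assumed to have been indexed with `bwa index` beforehand.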

DNA Microarray Integromics Analysis Platform (1/2)

  • For people who perform biological investigations using DNA microarrays

  • Goal: help to analyze gene expression information and correlate it with other clinical data

  • Analyses available now: normalization, clustering, SAM, T-test, GO-based enrichment, ANNs, PCA, panel filtering

  • ’Integromics’ analyses in ’beta’ (testing) stage

    • CCA, PLS (gene expression and lipidomics)

    • Roleswitch, TargetScore (gene expression and miRNA)

  • Still in continuous development (Pathways, EBI export etc.)

  • Supported models: some Affymetrix, Agilent SurePrint (adding support for others is possible, in case of demand)
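
To make one of the listed analyses concrete, the following sketch shows how a per-gene T-test between two sample groups could be computed. It is not the platform's implementation, which the slides do not describe; it uses Welch's variant of the t statistic, and the function names and the input layout (a mapping from gene to per-sample log2 expression values) are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples.

    Each group needs at least two values, and the groups must not both have
    zero variance, otherwise the standard error is zero.
    """
    var_a, var_b = variance(group_a), variance(group_b)
    std_err = sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / std_err


def rank_genes(expression, cases, controls):
    """Rank genes by the absolute value of their t statistic.

    `expression` maps gene name -> {sample name: log2 expression value};
    `cases` and `controls` are lists of sample names (hypothetical layout).
    """
    scores = {}
    for gene, values in expression.items():
        case_values = [values[sample] for sample in cases]
        control_values = [values[sample] for sample in controls]
        scores[gene] = welch_t(case_values, control_values)
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
```

Genes at the top of the returned list differ most between the two groups relative to their within-group variability. A real analysis would also convert the statistic to a p-value and correct for multiple testing, which this sketch omits.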

DNA Microarray Integromics Analysis Platform (2/2)

  • Notable features

    • Integration with EBI ArrayExpress (import, MIAME)

    • Sharing experiments with others

    • Importing own data for further analysis

    • Supported languages: PL, EN

  • Manual:

  • Cooperation

    • Jagiellonian University Medical College, Kraków

    • Medical University of Silesia, Katowice

    • Institute of Oncology, Gliwice

Agilent GeneSpring GX

  • RDP:

  • Used with Windows Remote Desktop

  • Integrated with the DNA Integromics Platform for uniform microarray files management

  • 5-year, single-seat license for all registered Polish scientists

  • Manual:

Galaxy NGS Server (1/4)

Galaxy NGS Server (2/4)

”Galaxy is an open, web-based platform for data intensive biomedical research.”

  • Goal: deploy high-performance, high-throughput NGS data analysis solution on top of HPC resources for PL-Grid users

  • Needs a lot of adjustments and in-house add-on development

  • Work started in December 2013 and is still at a beta stage, but the server is accessible to anyone willing to test and to help

  • Planned integrated tools (list not closed): GATK, SAMtools, Bowtie, TopHat, BWA, bedtools, Cufflinks, Picard, SnpEff/SnpSift, Flexbar, FastQC, MACS

  • Targeted platforms: Illumina *Seq, Ion Proton, Roche 454

Galaxy NGS Server (3/4)

  • Notable features

    • Full integration with Zeus cluster and disk arrays

    • PBS and MQ system for effective job queuing

    • Secured environment (open for all PL-Grid users, not ”public”)

    • All major Galaxy features (history, sharing, viewers)

  • Well documented workflows designed by NGS experts

    • Basics (alignment and quality control, trimming, filtering)

    • DNA-Seq, RNA-Seq, variant calling, SNP calling, methylation, exome analysis with annotations

  • Manual:

  • Cooperation

    • Institute of Pharmacology, Polish Academy of Sciences, Kraków

    • OMICRON, Jagiellonian University Medical College, Kraków

    • National Research Institute of Animal Production, Kraków-Balice
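
The basic workflow steps named on this slide (quality control, trimming, filtering) can be illustrated with a minimal sketch. It assumes reads with Phred+33 encoded quality strings and applies a deliberately simple rule: low-quality bases are cut from the 3' end and reads that become too short are dropped. The function names and thresholds are hypothetical, and dedicated tools such as Flexbar use more elaborate strategies than the one shown here.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string (Phred+33 by default) into integers."""
    return [ord(ch) - offset for ch in quality_line]


def trim_3prime(sequence, quality_line, min_quality=20):
    """Cut bases off the 3' end until one with quality >= min_quality is met."""
    scores = phred_scores(quality_line)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_line[:end]


def filter_reads(reads, min_quality=20, min_length=30):
    """Yield trimmed (name, sequence, quality) records that are long enough.

    `reads` is any iterable of (name, sequence, quality_line) tuples.
    """
    for name, sequence, quality_line in reads:
        trimmed_seq, trimmed_qual = trim_3prime(sequence, quality_line,
                                                min_quality)
        if len(trimmed_seq) >= min_length:
            yield name, trimmed_seq, trimmed_qual
```

Because `filter_reads` is a generator, it can process a large FASTQ file record by record without loading it into memory, provided the caller supplies a streaming parser.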

Galaxy NGS Server (4/4)

  • Current challenges

    • Some security issues in the Galaxy code prevent the production deployment

    • Cluster integration is there, yet rather unstable and prone to fail (quite an intricate contraption, it is)

    • Broad variety of integrated tools and wrappers does not help

  • Call to action – who is needed

    • Users: the bigger the community, the easier to make us visible

    • Early adopters: tell us what you need, help us test and integrate the tools and workflows you use

    • Programmers: if you’d like to help us bring a dedicated HPC-powered Galaxy for Polish scientists, any assistance is greatly appreciated

    • Contact:

Links, Contact, Partners

  • These resources, services and tools (and much more) are available after registering to PL-Grid

  • PL-Grid User Manual

    • (PL)

    • (EN)

  • Questions, problems, requests about PL-Grid


  • Contact for LifeScience domain services


  • Login