Genomic Data Analysis Services Available for PL-Grid Users. Tomasz Waller, Tomasz Gubała, Kazimierz Murzyn. Academic Computer Centre Cyfronet AGH, Klaster LifeScience Kraków, Recent Advances in Omics Research, Kraków, October 2014.


Genomic Data Analysis Services Available for PL-Grid Users

Tomasz Waller, Tomasz Gubała, Kazimierz Murzyn

Academic Computer Centre Cyfronet AGH, Klaster LifeScience Kraków


Recent Advances in Omics Research, Kraków, October 2014

ACC Cyfronet AGH and PL-Grid Infrastructure

Academic Computer Centre Cyfronet AGH

  • Established in 1973 (40 years of experience)

  • Provides network, computational power and data storage capabilities for Polish science

    • ~374 TFlops (zeus, [email protected]), 2.5 PB (disks) and 3.5 PB (tapes)

    • 1.7 PFlops (prometheus) with 10 PB of disks, expected first half of 2015

    • Regular and bigmem nodes, vSMP, GPGPU, FPGA, MPI over InfiniBand

    • Details:

PL-Grid Infrastructure for Polish science

  • Five computing centers with Cyfronet as the consortium leader

  • Total: ~588 TFlops and ~5.6 PB (disks) but soon to grow considerably (see above)

  • Available free of charge to all Polish scientistsand their foreign collaborators

  • Details:

Using PL-Grid Infrastructure

  • Register at

    • User verification process based on Polish OPI number

    • Assistants and foreigners are confirmed by Polish PIs

    • Variety of basic and higher level services available after login

  • Local SSH access, cloud computing, middlewares

  • Considerable library of installed applications

    • GATK, MACS, SAMTools, Picard, TopHat, Bowtie, (p)BWA, R/Bioconductor, AutoDock/AutoGrid, BLAST, Clustal, CPMD, Gromacs, NAMD, Matlab, Mathematica …

    • Users are free to compile and install their own applications via shell login

    • Own commercial licenses can be used on HPC resources

  • Specific services dedicated to the Life Science domain

DNA Microarray Integromics Analysis Platform (1/2)

  • For people who perform biological investigations using DNA microarrays

  • Goal: help to analyze gene expression information and correlate it with other clinical data

  • Analyses available now: normalization, clustering, SAM, T-test, GO-based enrichment, ANNs, PCA, panel filtering

  • ‘Integromics’ analyses in ‘beta’ (testing) stage

    • CCA, PLS (gene expression and lipidomics)

    • Roleswitch, TargetScore (gene expression and miRNA)

  • Still in continuous development (Pathways, EBI export etc.)

  • Supported models: some Affymetrix, Agilent SurePrint (adding support for others is possible, in case of demand)
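To give a feel for the kind of per-gene test listed among the available analyses, here is a minimal sketch of a two-group comparison using Welch's t statistic. It is an illustration only, not the platform's implementation: the function names (`welch_t`, `rank_genes`), the gene labels and the toy expression values are all hypothetical, and a real analysis would also normalize the data first and correct for multiple testing.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    na, nb = len(group_a), len(group_b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two samples")
    # Standard error of the difference of means, using sample variances.
    se = sqrt(variance(group_a) / na + variance(group_b) / nb)
    if se == 0:
        raise ValueError("both groups have zero variance")
    return (mean(group_a) - mean(group_b)) / se


def rank_genes(expression, control_cols, case_cols):
    """Return (gene, t) pairs sorted by |t|, largest first."""
    scores = []
    for gene, values in expression.items():
        control = [values[i] for i in control_cols]
        case = [values[i] for i in case_cols]
        scores.append((gene, welch_t(case, control)))
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical log-scale expression values: columns 0-2 are controls,
# columns 3-5 are cases.
expression = {
    "GENE_A": [5.0, 5.2, 4.9, 8.1, 8.3, 7.9],
    "GENE_B": [6.0, 6.1, 5.9, 6.0, 6.2, 5.8],
}
ranked = rank_genes(expression, control_cols=[0, 1, 2], case_cols=[3, 4, 5])
for gene, t in ranked:
    print(f"{gene}\tt = {t:.2f}")
```

Genes with a large absolute t value are candidates for differential expression; converting t to a p-value requires the Welch-Satterthwaite degrees of freedom, which this sketch leaves out.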

DNA Microarray Integromics Analysis Platform (2/2)

  • Notable features

    • Integration with EBI ArrayExpress (import, MIAME)

    • Sharing experiments with others

    • Importing own data for further analysis

    • Supported languages: PL, EN

  • Manual:

  • Cooperation

    • Jagiellonian University Medical College, Kraków

    • Medical University of Silesia, Katowice

    • Institute of Oncology, Gliwice

Agilent GeneSpring GX

  • RDP:

  • Used with Windows Remote Desktop

  • Integrated with the DNA Integromics Platform for uniform microarray files management

  • 5-year, single-seat license for all registered Polish scientists

  • Manual:

Galaxy NGS Server (2/4)

“Galaxy is an open, web-based platform for data intensive biomedical research.”

  • Goal: deploy high-performance, high-throughput NGS data analysis solution on top of HPC resources for PL-Grid users

  • Needs a lot of adjustments and in-house add-on development

  • Work started in December 2013 and is still at a beta stage, but accessible to anyone willing to test and help

  • Planned integrated tools (list not closed): GATK, SAMtools, Bowtie, TopHat, BWA, bedtools, Cufflinks, Picard, SnpEff/SnpSift, Flexbar, FastQC, MACS

  • Targeted platforms: Illumina *Seq, Ion Proton, Roche 454

Galaxy NGS Server (3/4)

  • Notable features

    • Full integration with Zeus cluster and disk arrays

    • PBS and MQ system for effective job queuing

    • Secured environment (open for all PL-Grid users, not “public”)

    • All major Galaxy features (history, sharing, viewers)

  • Well documented workflows designed by NGS experts

    • Basics (alignment and quality control, trimming, filtering)

    • DNA-Seq, RNA-Seq, variant calling, SNP calling, methylation, exome analysis with annotations

  • Manual:

  • Cooperation

    • Institute of Pharmacology, Polish Academy of Sciences, Kraków

    • OMICRON, Jagiellonian University Medical College, Kraków

    • National Research Institute of Animal Production, Kraków-Balice
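The trimming and filtering steps listed among the basic workflows above can be pictured with a small sketch. This is not how the Galaxy server or any of its integrated tools implement it; it is a simplified illustration that assumes Phred+33 quality encoding, removes low-quality bases from the 3' end only, and then drops reads that have become too short. The function names, thresholds and example reads are hypothetical.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_3prime(sequence, quality, min_quality=20):
    """Cut trailing bases whose Phred score is below min_quality."""
    scores = phred_scores(quality)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality[:end]


def filter_reads(reads, min_quality=20, min_length=5):
    """Trim each (name, sequence, quality) read and keep those still long enough."""
    kept = []
    for name, sequence, quality in reads:
        if len(sequence) != len(quality):
            raise ValueError(f"{name}: sequence and quality lengths differ")
        sequence, quality = trim_3prime(sequence, quality, min_quality)
        if len(sequence) >= min_length:
            kept.append((name, sequence, quality))
    return kept


# Hypothetical reads: 'I' encodes Q40 and '#' encodes Q2 in Phred+33.
reads = [
    ("read1", "ACGTACGTAC", "IIIIIIII##"),
    ("read2", "ACGTACGTAC", "II########"),
]
kept = filter_reads(reads)
for name, sequence, quality in kept:
    print(name, sequence, quality)
```

Here `read1` loses its two low-quality trailing bases and is kept, while `read2` shrinks to two bases and is discarded. Dedicated trimming tools typically add adapter removal and sliding-window criteria on top of this basic idea.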

Galaxy NGS Server (4/4)

  • Current challenges

    • Some security issues in the Galaxy code prevent the production deployment

    • Cluster integration is there, yet rather unstable and prone to fail (quite an intricate contraption, it is)

    • Broad variety of integrated tools and wrappers does not help

  • Call to action – who is needed

    • Users: the bigger the community, the easier to make us visible

    • Early adopters: tell us what you need, help us test and integrate the tools and workflows you use

    • Programmers: if you’d like to help us bring a dedicated HPC-powered Galaxy for Polish scientists, any assistance is greatly appreciated

    • Contact: [email protected]

Links, Contact, Partners

  • These resources, services and tools (and much more) are available after registering to PL-Grid

  • PL-Grid User Manual

    • (PL)

    • (EN)

  • Questions, problems, requests about PL-Grid

  • Contact for LifeScience domain services