Genomic Data Analysis Services Available for PL-Grid Users

Tomasz Waller, Tomasz Gubała, Kazimierz Murzyn

Academic Computer Centre Cyfronet AGH, cyfro.net

Klaster LifeScience Kraków, lifescience.pl

Recent Advances in Omics Research, Kraków, October 2014


ACC Cyfronet AGH and PL-Grid Infrastructure

Academic Computer Centre Cyfronet AGH

  • Established in 1973 (40 years of experience)

  • Provides network, computational power and data storage capabilities for Polish science

    • ~374 TFlops (zeus, [email protected]), 2.5 PB (disks) and 3.5 PB (tapes)

    • 1.7 PFlops (prometheus) with 10 PB of disks, expected first half of 2015

    • Regular and bigmem nodes, vSMP, GPGPU, FPGA, MPI over InfiniBand

    • Details: http://kdm.cyfronet.pl/

PL-Grid Infrastructure for Polish science

  • Five computing centers with Cyfronet as the consortium leader

  • Total: ~588 TFlops and ~5.6 PB (disks) but soon to grow considerably (see above)

  • Available free of charge to all Polish scientists and their foreign collaborators

  • Details: http://www.plgrid.pl


Using PL-Grid Infrastructure

  • Register at https://portal.plgrid.pl

    • User verification process based on Polish OPI number

    • Assistants and foreigners are confirmed by Polish PIs

    • Variety of basic and higher level services available after login

  • Local SSH access, cloud computing, middleware

  • Considerable library of installed applications

    • GATK, MACS, SAMTools, Picard, TopHat, Bowtie, (p)BWA, R/Bioconductor, AutoDock/AutoGrid, BLAST, Clustal, CPMD, Gromacs, NAMD, Matlab, Mathematica …

    • Users are free to compile and install their own applications via shell login

    • Users can bring their own commercial licenses to the HPC resources

  • Specific services dedicated to the Life Science domain


DNA Microarray Integromics Analysis Platform (1/2)

https://lifescience.plgrid.pl/

  • For people who perform biological investigations using DNA microarrays

  • Goal: help to analyze gene expression information and correlate it with other clinical data

  • Analyses available now: normalization, clustering, SAM, T-test, GO-based enrichment, ANNs, PCA, panel filtering

  • “Integromics” analyses in “beta” (testing) stage

    • CCA, PLS (gene expression and lipidomics)

    • Roleswitch, TargetScore (gene expression and miRNA)

  • Still in continuous development (Pathways, EBI export etc.)

  • Supported models: some Affymetrix, Agilent SurePrint (support for others can be added on demand)
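
To make the T-test analysis listed above more concrete, here is a minimal, standard-library-only sketch of a per-gene Welch's t-test on log2-transformed intensities. It is not the platform's implementation: the function names, the gene identifiers and the intensity values are all made up for illustration, and a real analysis would also need normalization across arrays and multiple-testing correction.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic for two lists of expression values."""
    var_a, var_b = variance(group_a), variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


def rank_genes(expression, control_cols, treated_cols):
    """Log2-transform raw intensities and rank genes by |t|, largest first."""
    scores = {}
    for gene, intensities in expression.items():
        logged = [math.log2(value) for value in intensities]
        control = [logged[i] for i in control_cols]
        treated = [logged[i] for i in treated_cols]
        scores[gene] = welch_t(treated, control)
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical intensities: three control arrays followed by three treated arrays.
toy_expression = {
    "GENE_A": [100, 110, 95, 400, 420, 390],
    "GENE_B": [200, 210, 190, 205, 195, 215],
}
ranking = rank_genes(toy_expression, control_cols=[0, 1, 2], treated_cols=[3, 4, 5])
for gene, t_value in ranking:
    print(f"{gene}\t{t_value:.2f}")
```

The sketch assumes every group has at least two samples and non-zero variance; a group of identical values would cause a division by zero.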


DNA Microarray Integromics Analysis Platform (2/2)

  • Notable features

    • Integration with EBI ArrayExpress (import, MIAME)

    • Sharing experiments with others

    • Importing own data for further analysis

    • Supported languages: PL, EN

  • Manual: https://docs.cyfronet.pl/x/JpaZ

  • Cooperation

    • Jagiellonian University Medical College, Kraków

    • Medical University of Silesia, Katowice

    • Institute of Oncology, Gliwice


Agilent GeneSpring GX

  • RDP: genespring.plgrid.pl

  • Used with Windows Remote Desktop

  • Integrated with the DNA Integromics Platform for uniform microarray file management

  • 5-year, single-seat license for all registered Polish scientists

  • Manual: https://docs.cyfronet.pl/x/JIq1


Galaxy NGS Server (1/4)


Galaxy NGS Server (2/4)

https://galaxy.plgrid.pl/

“Galaxy is an open, web-based platform for data intensive biomedical research.”

  • Goal: deploy a high-performance, high-throughput NGS data analysis solution on top of HPC resources for PL-Grid users

  • Needs a lot of adjustments and in-house add-on development

  • Work started in December 2013 and is still at a beta stage, but the server is accessible to anyone willing to test and help

  • Planned integrated tools (list not closed): GATK, SAMtools, Bowtie, TopHat, BWA, bedtools, Cufflinks, Picard, SnpEff/SnpSift, Flexbar, FastQC, MACS

  • Targeted platforms: Illumina *Seq, Ion Proton, Roche 454
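
As a rough illustration of the kind of read-level processing that quality-control and trimming tools perform, the sketch below drops FASTQ reads whose mean Phred score falls under a threshold. It is not how FastQC, Flexbar or the Galaxy server work internally. It assumes four-line FASTQ records and Phred+33 quality encoding (older Illumina pipelines used a different offset), and the function names, the threshold and the reads are hypothetical.

```python
from io import StringIO


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input (or a blank line) stops the iteration
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        quality = handle.readline().rstrip("\n")
        yield header, sequence, quality


def mean_phred(quality, offset=33):
    """Mean Phred score of a quality string, assuming Phred+33 encoding."""
    return sum(ord(char) - offset for char in quality) / len(quality)


def filter_reads(handle, min_mean_quality=20.0):
    """Keep only reads whose mean Phred score reaches the threshold."""
    return [
        record for record in read_fastq(handle)
        if mean_phred(record[2]) >= min_mean_quality
    ]


# Two made-up reads: 'I' encodes Phred 40, '#' encodes Phred 2.
toy_fastq = StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACGT\n+\n####\n"
)
kept = filter_reads(toy_fastq)
print([header for header, _, _ in kept])
```

Production tools typically trim low-quality ends and remove adapters rather than discarding whole reads; whole-read filtering is used here only because it is the shortest variant to show.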


Galaxy NGS Server (3/4)

  • Notable features

    • Full integration with Zeus cluster and disk arrays

    • PBS and MQ system for effective job queuing

    • Secured environment (open for all PL-Grid users, not “public”)

    • All major Galaxy features (history, sharing, viewers)

  • Well-documented workflows designed by NGS experts

    • Basics (alignment and quality control, trimming, filtering)

    • DNA-Seq, RNA-Seq, variant calling, SNP calling, methylation, exome analysis with annotations

  • Manual: https://docs.cyfronet.pl/x/voas

  • Cooperation

    • Institute of Pharmacology, Polish Academy of Sciences, Kraków

    • OMICRON, Jagiellonian University Medical College, Kraków

    • National Research Institute of Animal Production, Kraków-Balice
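
For readers unfamiliar with PBS, which this slide names as the job queuing system, the sketch below renders the text of a simple batch script. It does not show how the Galaxy server itself submits jobs. The `nodes=1:ppn=N` resource line follows Torque-style syntax, which is an assumption about the cluster (other PBS variants use a different resource syntax), and the job name, file names and helper function are hypothetical.

```python
def render_pbs_script(job_name, command, cores=1, walltime="01:00:00", queue=None):
    """Build the text of a PBS batch script that runs one shell command."""
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",            # job name shown in the queue
        f"#PBS -l nodes=1:ppn={cores}",   # one node, `cores` processors (Torque-style)
        f"#PBS -l walltime={walltime}",   # maximum run time, HH:MM:SS
    ]
    if queue is not None:
        lines.append(f"#PBS -q {queue}")  # target queue, only if one is given
    lines += [
        'cd "$PBS_O_WORKDIR"',            # start in the directory the job was submitted from
        command,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical job: summarize alignment flags of a made-up BAM file.
script = render_pbs_script(
    job_name="flagstat_demo",
    command="samtools flagstat sample.bam > sample.flagstat.txt",
)
print(script)
```

On a PBS cluster, a script like this would typically be saved to a file and submitted with `qsub`; the right queue name and resource limits depend on the site and should be taken from the PL-Grid documentation.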


Galaxy NGS Server (4/4)

  • Current challenges

    • Some security issues in the Galaxy code prevent the production deployment

    • Cluster integration is there, yet rather unstable and prone to fail (quite an intricate contraption, it is)

    • Broad variety of integrated tools and wrappers does not help

  • Call to action – who is needed

    • Users: the bigger the community, the easier to make us visible

    • Early adopters: tell us what you need, help us test and integrate the tools and workflows you use

    • Programmers: if you’d like to help us bring a dedicated HPC-powered Galaxy to Polish scientists, any assistance is greatly appreciated

    • Contact: [email protected]


Links, Contact, Partners

  • These resources, services and tools (and much more) are available after registering with PL-Grid

    https://portal.plgrid.pl/

  • PL-Grid User Manual

    • https://docs.plgrid.pl/podrecznik_uzytkownika (PL)

    • https://docs.plgrid.pl/display/PLGDoc/User+manual (EN)

  • Questions, problems, requests about PL-Grid

    • https:[email protected]

  • Contact for LifeScience domain services

    • [email protected]

