
Galaxy (of bioinformatics) Overview

Galaxy (of bioinformatics) Overview. Martin Senger <martin.senger@kaust.edu.sa> [also using a few slides from presentations given at the Galaxy Developers Conference 2011]. Three basic themes: what Galaxy can do (or could do); show me where I can try it for myself; what we can do to make our Galaxy better.


Presentation Transcript


  1. Galaxy (of bioinformatics) Overview
     Martin Senger <martin.senger@kaust.edu.sa>
     [also using a few slides from presentations given at the Galaxy Developers Conference 2011]

  2. Three basic themes...
     • What Galaxy can do... (or could do)
     • Show me where I can try it for myself
     • What we can do to make our Galaxy better
     ...and what this is not
     • a detailed tutorial on how to use Galaxy
     • a way to convince you that I understand everything about Galaxy

  3. What is Galaxy
     • A web-based interface to command-line tools (of any kind) and their combinations (“workflows”)
     • Galaxy performs analysis interactively through the web, on arbitrarily large datasets
     • Galaxy remembers what it did: the history
     • Flexibility to include anybody’s command-line tools
       • by writing wrappers, for which templates are available
     • An environment for sharing tools (or their wrappers)
       • the “Tool Shed” repository
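To make the "wrapper plus history" idea concrete, here is a toy sketch in Python. It is not how Galaxy is implemented (Galaxy's real wrappers are tool definition files, not Python classes); `ToyHistory`, `run_tool`, and the tool name `to_upper` are hypothetical names invented for illustration. It only shows the principle: a command-line program is run on the user's behalf, and every run is recorded so it can be inspected or repeated later.

```python
import subprocess
import sys


class ToyHistory:
    """Hypothetical stand-in for a history: an ordered log of tool runs."""

    def __init__(self):
        self.entries = []

    def run_tool(self, name, argv):
        # Run the wrapped command-line tool and capture its standard output.
        result = subprocess.run(argv, capture_output=True, text=True, check=True)
        # Remember what was run and what it produced.
        self.entries.append({"tool": name, "command": argv, "output": result.stdout})
        return result.stdout


history = ToyHistory()
# The "tool" here is just the Python interpreter printing an upper-cased sequence.
upper = history.run_tool("to_upper", [sys.executable, "-c", "print('acgt'.upper())"])
```

After the call, `history.entries` holds one record with the exact command line, which is what makes a later re-run possible.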

  4. What does it look like?

  5. Galaxy has tools...

  6. Galaxy has data... (well, “datasets”)
     • Locally stored data
       • user-specific
       • shared between users
         • e.g. genome builds
     • Origin of data
       • data uploaded from your computer
         • using a web interface
         • using an FTP server
       • data fetched from external databases (“datasources”)
         • only those that are “aware” of Galaxy
         • internally, there are two ways to fetch data (asynchronous vs. synchronous)
         • you need to be familiar with these databases and their UIs

  7. Datasource – an example [figure: screenshots with steps 1, 2, 3]

  8. Galaxy has data... and data have types
     • Data have metadata
       • so that data can be used only with tools that recognize their data type
     • Data have attributes
       • annotate data
       • convert data to a new format
       • change data type
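The type check described on this slide can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the `Dataset` and `Tool` classes, the `usable_tools` function, and the tool names are hypothetical, and the format labels are used only as examples. The point is simply that a dataset carries a type and a tool is offered only when it declares that type as acceptable input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    datatype: str  # metadata describing the format of the data


@dataclass(frozen=True)
class Tool:
    name: str
    accepts: frozenset  # data types this (hypothetical) tool recognizes


def usable_tools(dataset, tools):
    """Return the names of tools that recognize the dataset's type."""
    return [tool.name for tool in tools if dataset.datatype in tool.accepts]


tools = [
    Tool("quality_report", frozenset({"fastq"})),
    Tool("interval_overlap", frozenset({"bed", "interval"})),
    Tool("line_count", frozenset({"fastq", "bed", "interval", "tabular"})),
]

reads = Dataset("sample_reads", "fastq")
regions = Dataset("peaks", "bed")

tools_for_reads = usable_tools(reads, tools)
tools_for_regions = usable_tools(regions, tools)
```

Changing a dataset's type, as the slide mentions, would in this picture change which tools are listed for it.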

  9. Galaxy can also send data out...

  10. Finally, Galaxy can do workflows...
     • An automated set of steps, perhaps run each time with different input data (of the same type)
     • reproducibility (usable in publications)
     • reusability (sharing workflows with others)
     • created from scratch (using a workflow editor) or from your history
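As a rough mental model of a workflow, consider the following Python sketch. It is an analogy only, not Galaxy code: `run_workflow` and the three step functions are hypothetical names, and the "tools" are plain functions. The same fixed sequence of steps is applied to different inputs of the same type, which is what gives reproducibility and reuse.

```python
def drop_comments(lines):
    """Step 1: remove comment lines starting with '#'."""
    return [line for line in lines if not line.startswith("#")]


def to_upper(lines):
    """Step 2: normalize sequences to upper case."""
    return [line.upper() for line in lines]


def count_lines(lines):
    """Step 3: summarize the result as a single number."""
    return len(lines)


def run_workflow(steps, data):
    """Apply each step in order, feeding the output of one into the next."""
    for step in steps:
        data = step(data)
    return data


# The workflow is defined once...
workflow = [drop_comments, to_upper, count_lines]

# ...and reused with different inputs of the same type (lists of text lines).
result_a = run_workflow(workflow, ["# sample A", "acgt", "ttga"])
result_b = run_workflow(workflow, ["ggcc", "# note", "# note 2", "atat", "cgcg"])
```

Because the list of steps is itself data, it can be stored and shared, which mirrors the idea of extracting a workflow from a history.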

  11. An example – a workflow editor Thanks to:

  12. Users creating non-trivial workflows
     • a user would not have done this from the command line on our cluster

  13. Creating a workflow from your history
     • If we have time (6 min), click here: http://main.g2.bx.psu.edu/screencast

  14. There are many ways to skin a cat...
     • Where are all these galaxies?
       • public servers
         • available immediately, free of charge
         • http://main.g2.bx.psu.edu/
         • and a few others, such as http://galaxy.nbic.nl/
         • usually limited resources
         • you cannot customize them to your special needs
       • KAUST/CBRC Galaxy
         • http://galaxy.cbrc.kaust.edu.sa/
         • running on an internal cluster with limited resources
         • but we can do with it whatever we need to do
       • Galaxy in the Amazon cloud (CloudMan)
         • when you do not have infrastructure in house
         • when you have particular resource needs (cores, memory...)
         • when you need customization
         • if you have a credit card
         • details in this presentation:
           http://wiki.g2.bx.psu.edu/GCC2011?action=AttachFile&do=get&target=CloudManGalaxyOnTheCloud.pdf
     • Galaxy also has a RESTful API for programmatic access (beta)
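For the programmatic access mentioned on this slide, a request URL might be assembled as in the Python sketch below. This rests on assumptions not stated in the slides: that resources live under an `/api/` path and that a per-user API key is passed as a `key` query parameter. The host `galaxy.example.org`, the key value, and the helper name `api_url` are placeholders; check the documentation of your Galaxy server before relying on any of this. The sketch only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode


def api_url(base_url, resource, api_key):
    """Build a URL for a (presumed) Galaxy API resource.

    Assumes resources are exposed under /api/ and authenticated
    with a `key` query parameter; both are assumptions.
    """
    query = urlencode({"key": api_key})
    return f"{base_url.rstrip('/')}/api/{resource}?{query}"


# Placeholder host and key, for illustration only.
histories_url = api_url("http://galaxy.example.org/", "histories", "SECRET")
```

A real client would then fetch this URL with an HTTP library and parse the response.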

  15. Our KAUST/CBRC Galaxy
     There’s no such thing as a free lunch...
     ...we need to:
     Image courtesy of http://mychinaconnection.com/english-proverb/there-is-no-free-lunch/

  16. How to make better use of our Galaxy
     • Data issues
       • add genome-wise data we (CBRC) need
       • add data usable by others (Core, students...)
     • Tools
       • select the subset of tools we really need and test them fully
       • consider wrapping other tools (not yet available by default)
     • Logistics
       • provide user-oriented courses
       • create a user group to share experience and to promote knowledge
       • monitor its stability and usage
     • Hardware/sysadmin issues
       • install it on better hardware (in due time)
       • change the current queue priority (a chicken-and-egg problem)
       • add an FTP server

  17. Thank you. Any questions?
     More info:
     • Galaxy home page: http://galaxy.psu.edu/
     • An overview presentation: http://wiki.g2.bx.psu.edu/GCC2011?action=AttachFile&do=get&target=IntroductionSession.pdf
