
D.Ludviga

IMCS UL (SigmaNet)

Overview of application CoPS (Comparison of Protein Structures)

2nd BG-II AHM, 13.05.2009, Riga, Latvia



Outline

  • About CoPS (scientific value);

  • What's new?;

  • Challenges (mentioned during 1AHM);

  • Our solution;

  • Collaboration possibilities.



About CoPS (scientific value)

  • Started at the beginning of BG-II as the pilot application;

    • developed by Dr. Natalja Kurbatova and Assoc. Prof. Juris Viksna

  • Field – Bioinformatics;

    “It has taken biologists some 230 years to identify and describe three quarters of a million insects; if there are indeed at least thirty million ... then, working as they have in the past, insect taxonomists have ten thousand years of employment ahead of them.”

    R. Leakey and R. Lewin



About CoPS

  • Assumption: protein structures have evolved by a stepwise process, each step involving a small change in the structure.

  • Protein structures are compared using the Evolutionary Secondary Structures Matching (ESSM) algorithm.

    • ESSM was created for pairwise comparison of structures; it makes it possible to identify fold mutations and to estimate the evolutionary relationship between proteins.

  • To explore the evolution of protein structures, an all-against-all comparison has to be done.

  • Application needs:

    • Protein database (data set description files are stored):

      • PDB (3D), FASTA (.txt), structural elements;

      • size ~8 GB (~2.3GB if compressed);

    • Total number of tasks: 20 451 945, divided into 410 files
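The task figures above can be checked with a little arithmetic. The following is a minimal sketch in Python, assuming that each task is one unordered pair of structures and that the tasks are spread evenly across the files. The helper names and the toy identifiers are hypothetical and not part of CoPS; the slides do not state how many structures the data set contains.

```python
from itertools import combinations, islice

TOTAL_TASKS = 20_451_945  # from the slide
TASK_FILES = 410          # from the slide


def tasks_per_file(total_tasks, n_files):
    """Ceiling division: the largest number of tasks any one file must hold."""
    return -(-total_tasks // n_files)


def all_against_all(ids):
    """Yield every unordered pair once; n structures give n*(n-1)/2 pairs."""
    return combinations(ids, 2)


def chunked(pairs, size):
    """Split a stream of pairs into lists of at most `size` pairs."""
    it = iter(pairs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


print(tasks_per_file(TOTAL_TASKS, TASK_FILES))
# Toy data set of four structures, split into files of at most four pairs.
print(list(chunked(all_against_all(["a", "b", "c", "d"]), 4)))
```

Under these assumptions each of the 410 files holds a little under 50 000 pair comparisons.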



About CoPS

  • Application consists of:

    • jdl.essm - JDL file for submitting ESSM (CoPS) job

    • essm.sh - shell script that is executed on the worker node (WN) once the job starts

    • database.tar.gz - archive of the protein database with protein descriptions, which is extracted on the WN before anything else starts

    • essm.linux - statically compiled 32-bit executable for ESSM (CoPS) that runs on Scientific Linux [CERN] 4

    • pairs.txt - sample calculation file that contains pair comparisons

    • At the end of each job, the result file pairs.result is generated

  • The results are afterwards visualized using a self-made tool.

    • developed using one of the GRADE components
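To make the relationship between these files concrete, here is a minimal sketch in Python that renders a job description for one pairs file. The slides do not show the contents of jdl.essm, so this is only an illustration: the attribute names (Executable, Arguments, StdOutput, StdError, InputSandbox, OutputSandbox) are standard gLite JDL attributes, but the function `render_jdl`, the way the pairs file is passed as an argument, and the std.out/std.err file names are assumptions. The database archive is deliberately left out because the slides do not say how it reached the worker node.

```python
def render_jdl(pairs_file="pairs.txt"):
    """Return JDL text for one ESSM job (illustrative values, not the real jdl.essm)."""
    # Files shipped with the job; file names are taken from the slide.
    input_sandbox = ["essm.sh", "essm.linux", pairs_file]
    # Files returned after the job; std.out and std.err are assumed names.
    output_sandbox = ["std.out", "std.err", "pairs.result"]

    def jdl_list(items):
        # JDL lists are written as {"a", "b", ...}
        return "{" + ", ".join(f'"{item}"' for item in items) + "}"

    lines = [
        'Executable = "essm.sh";',
        f'Arguments = "{pairs_file}";',  # assumption: script takes the pairs file
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        f"InputSandbox = {jdl_list(input_sandbox)};",
        f"OutputSandbox = {jdl_list(output_sandbox)};",
    ]
    return "\n".join(lines) + "\n"


print(render_jdl())
```

Generating the description from a function like this would make it easy to produce one job per task file by varying only the pairs file name.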





What's new?

  • Developed (results received);

    • ~2 weeks.

  • Implemented in Migrating Desktop;

  • Presented/demonstrated at the OGF25/EGEE Users Forum in Catania, Italy;

  • Demo



Challenges and our solution

  • Challenges:

    • Transport the data;

      • 410 x 2.3GB ≈ 950GB

    • VOMS-proxy.

  • Solutions

    • The required data was installed in the software directories of separate clusters (creating “dedicated” protein clusters);

    • MyProxy.
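The data transport figure can be reproduced with a short calculation. This is a minimal sketch in Python, reading the slide as one archive download per task file; the function names and the number of clusters in the second call are made up for illustration.

```python
ARCHIVE_GB = 2.3   # compressed database size, from the slides
JOBS = 410         # one job per task file, as implied by "410 x 2.3GB"


def per_job_transfer_gb(jobs, archive_gb):
    """Total volume if every job downloads its own copy of the archive."""
    return round(jobs * archive_gb, 1)


def per_cluster_transfer_gb(clusters, archive_gb):
    """Total volume if the archive is installed once per cluster."""
    return round(clusters * archive_gb, 1)


print(per_job_transfer_gb(JOBS, ARCHIVE_GB))    # close to the ~950 GB on the slide
print(per_cluster_transfer_gb(3, ARCHIVE_GB))   # 3 clusters is a made-up figure
```

Installing the data once per cluster makes the transferred volume grow with the number of clusters rather than with the number of jobs, which is the motivation for the dedicated clusters.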



Results

  • The results of the ESSM algorithm were successfully used for the exploration of the CATH fold space by using fold space graphs for representation of comparison results and estimation of "evolution distance" on the basis of observed changes.

  • The results obtained in the application can be seen as a few steps toward the creation of a general protein evolution model.



Collaboration

“Computer science is no more about computers than astronomy is about telescopes”

E.W.Dijkstra

  • Continue collaboration with biologists in LU;

  • Set up a VO or just dedicated servers:

    • PDB can be installed in a cluster's VO software directory

      • To speed up execution of jobs and avoid per-job download and extraction of these databases.
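The idea of a pre-installed database can be sketched as a lookup with a fallback. This is a minimal sketch in Python, not the CoPS implementation: gLite sites conventionally expose a VO's software area through an environment variable of the form `VO_<NAME>_SW_DIR`, but the variable name `VO_EXAMPLE_SW_DIR`, the `database` subdirectory and the function `locate_database` are hypothetical.

```python
import os
import tarfile
from pathlib import Path


def locate_database(sw_dir_var="VO_EXAMPLE_SW_DIR", archive="database.tar.gz",
                    subdir="database", workdir="."):
    """Return a directory holding the protein database.

    Prefer a copy pre-installed under the VO software directory; fall back
    to extracting the archive shipped with the job.
    """
    sw_dir = os.environ.get(sw_dir_var)
    if sw_dir:
        shared = Path(sw_dir) / subdir
        if shared.is_dir():
            return shared  # no download or extraction needed
    target = Path(workdir) / subdir
    if not target.is_dir():
        # Fallback: unpack the per-job archive into the working directory.
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(target)
    return target
```

With a lookup like this, the same job could run both on clusters that have the database installed and on clusters that do not, paying the extraction cost only in the second case.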



Thank you!
