
Public Bioinformatics Services from the EBI



Presentation Transcript


  1. Public Bioinformatics Services from the EBI Rodrigo Lopez

  2. Main Priorities • Providing access to comprehensive information resources in bioinformatics. • Database searching • Homology searching • Sequence analysis • 3D Structural analysis • MicroArrays • High availability

  3. Goals • Provide support (support@ebi.ac.uk) and assist with training (2can project). • Integrate the activities of various groups at the EBI. • In particular at the level of the web services. • Infrastructure planning and sharing of all External Services hardware and human resources.

  4. Data Resources • DNA sequences • Protein sequences • Genomes • Proteomes • Protein structures • Gene expression • Metabolic pathways • Functional patterns • Ontologies • Literature • Patent abstracts (*) • (*) SOON

  5. Homology Searches • Three classical applications: • Fasta, WU-Blast 2.0 and NCBI Blast 2.x • Advanced and very sensitive Smith-Waterman (S&W) protein homology searches available with: • MPsrch and Scanps • Identification of protein function: • InterProScan, FingerPrintScan, ppsearch... • 3D structure comparison using DALI/SSM/PQS/Ligand, etc.

  6. WU-Blast2 • New, noteworthy improvement: • SENSITIVITY control • Higher sensitivity means slower runs. • SWALL: 1 sec (900K+ sequences). • EMBL: 20 secs (taxdiv).

  7. Genomes & Proteomes • Similarity and homology searches are available using Fasta (ca. 100+ archaeal, bacterial and eukaryotic genomes & proteomes to date). • Specialised Blast servers for parasite genomes and vector screening (EVEC). • Fasta server for SNP scanning (HGVBASE). • Ensembl Blast servers for metazoan genomes and proteomes.

  8. Database Searching • Main service is based on SRS 6.x. • 150+ public databases are available. • More than 50 million records are searchable. • Bi-directional and multi-step links allow queries at the following levels: • A > B, A > B > C, A > C, etc…
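The multi-step links on slide 8 (A > B > C) can be pictured as paths through a graph of cross-referenced databases. The following is only a minimal sketch of that idea, not how SRS implements linking: the `LINKS` table and the `link_path` helper are hypothetical, and the database names are used purely as illustrative labels.

```python
from collections import deque

# Hypothetical cross-reference table: each database maps to the
# databases it links to directly. Links are listed in both directions
# to mimic bi-directional linking.
LINKS = {
    "EMBL": {"SWALL"},
    "SWALL": {"EMBL", "PDB"},
    "PDB": {"SWALL"},
}


def link_path(source, target, links=LINKS):
    """Return the shortest chain of databases from source to target, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        # Sort neighbours so the result is deterministic.
        for nxt in sorted(links.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(link_path("EMBL", "PDB"))
```

In this toy table there is no direct EMBL > PDB link, so the query is resolved as a two-step chain through the intermediate database.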

  9. Analysis Tools • Range from single nucleotide and protein sequence tools to MSA and phylogenetic tools (ClustalW and AMAS). • Gene prediction (GeneMark) • Function/Pattern identification (InterProScan, CpG analysis, Radar, PRATT...) • Large scale analysis of genomes (GeneQuiz) • Large scale analysis of proteomes (HPI, PA) • Bioinformatic application workbenches: w2h and AppLab for GCG and EMBOSS • http://www.ebi.ac.uk/Tools/

  10. ClustalW improvements: • Faster • Larger sets • Better tree views

  11. 2D & 3D Structural Analysis • Comparison of protein structures in 2D (DALI/SSM) • Fold classification (DSSP, HSSP, FSSP) • Quaternary structure comparisons (PQS) • 3D sequence alignment services (3Dseq)

  12. MicroArrays • ArrayExpress • MiameExpress • Expression Profiler

  13. EBI services targets • Provide access to data and tools that can support: • Description of disease aetiology. • Risk assessment. • Identification of drug targets. • Disease prevention. • Molecular definition of disease.

  14. Impact on human medicine • Cystic fibrosis. • Huntington’s disease. • Myotonic dystrophy. • Cancer. • Alzheimer’s. • Malaria.

  15. EBI services in biology • Describe populations and interactions. • Molecular Ecology. • Genetic variation. • Population management. • Genomics and Biodiversity.

  16. High Availability • Jobs run in a highly heterogeneous computer environment: • Large SMP servers (SGI and HP/Compaq) • Linux-based IBM PC farms • ca. 260 CPUs (minimum dual-CPU host) • Effective load sharing and balancing using Platform’s LSF. • Updating and rollover is achieved using ultra-efficient in-house developed tools.

  17. [Diagram labels] • Static content • LSF • Request brokers • BigIP • CGI • Servlets • JSP • WS (SOAP)

  18. ESEBI (Spanish for ‘it’s EBI’) • Is in effect a transparent GRID for providing free computational services to the community. • Highly adaptable and exportable. • High degree of maintainability and reliability (24/7 solution).

  19. Job management

  20. …addresses efficiently the job scheduling problem…

  21. …allows efficient job manipulation…

  22. …allows fast job requeueing… …and rescue…

  23. …permits job relocation… …and job resource (re)allocation…

  24. …but we are NOT perfect!

  25. Typical ES EBI LSF host characteristics • ‘See all/Share ALL FS’: • SAN and NFS Caches. • Local storage (ca. 100Gb/host) for outage resolution. • Hosts can be removed/added to a farm for maintenance without affecting the service. • Hosts can be added/removed from LSF queues as demand rises/falls.
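The host behaviour on slide 25 (hosts leaving a farm for maintenance, or joining as demand rises, without interrupting the service) can be illustrated with a toy dispatcher. This is a sketch under stated assumptions only: it is not LSF's scheduling algorithm or API, and the `Farm` class, its methods and the host names are hypothetical. It uses a plain least-loaded policy.

```python
class Farm:
    """Toy pool of hosts; each job goes to the host with the fewest running jobs."""

    def __init__(self, hosts):
        # Map each host name to its current number of running jobs.
        self.load = {host: 0 for host in hosts}

    def add_host(self, host):
        """Bring a host into the farm (no effect if it is already present)."""
        self.load.setdefault(host, 0)

    def remove_host(self, host):
        """Take a host out for maintenance; return how many jobs need requeueing."""
        return self.load.pop(host, 0)

    def dispatch(self):
        """Assign one job to the least-loaded host and return that host's name."""
        if not self.load:
            raise RuntimeError("no hosts available")
        # Iterate in name order so ties are broken deterministically.
        host = min(sorted(self.load), key=self.load.get)
        self.load[host] += 1
        return host


farm = Farm(["node1", "node2"])
first = farm.dispatch()              # goes to node1 (tie broken by name)
second = farm.dispatch()             # goes to node2
orphaned = farm.remove_host("node1")  # node1 leaves; its one job must be requeued
farm.add_host("node3")
third = farm.dispatch()              # goes to the empty node3
```

The point of the sketch is that dispatching keeps working while the set of hosts changes; the only extra bookkeeping is requeueing the jobs that were on the removed host.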

  26. REDUNDANCY! (example:InterProScan)

  27. EBI Network news • Network upgrade • to 1Gb/sec (Hinxton - Cambridge) • from two redundant 34Mb/sec (Hinxton - London) • Acquisition of more SMP servers and expansion of the current production and External Services HP/Compaq cluster as well as the IBM PC farms.

  28. ESEBI utilisation • 4 million hits/pages a month (excludes images). • More than 50K requests on the SRS servers per day. • Close to 1 million job requests per month. • Current traffic averages 5Mb/sec.
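To put the monthly and daily figures on slide 28 on a common scale, they can be converted to average per-second rates. The short calculation below assumes a 30-day month and evenly spread load; both are simplifying assumptions, and real traffic would show peaks well above these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # assumption: a 30-day month

jobs_per_month = 1_000_000   # "close to 1 million job requests per month"
srs_requests_per_day = 50_000  # "more than 50K requests ... per day"
pages_per_month = 4_000_000  # "4 million hits/pages a month"

jobs_per_second = jobs_per_month / (DAYS_PER_MONTH * SECONDS_PER_DAY)
srs_per_second = srs_requests_per_day / SECONDS_PER_DAY
pages_per_second = pages_per_month / (DAYS_PER_MONTH * SECONDS_PER_DAY)

print(f"jobs/s:  {jobs_per_second:.2f}")
print(f"SRS/s:   {srs_per_second:.2f}")
print(f"pages/s: {pages_per_second:.2f}")
```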

  29. New Technologies: ESEBI future • Web Services using SOAP/CORBA/RMI/LDAP • SOAPLab, WSE, WSRS, etc. • Provide programmatic access to biological data. • Provide programmatic access to bioinformatic applications as well.
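As a rough illustration of what programmatic access over SOAP involves, the sketch below builds a SOAP 1.1 request envelope with Python's standard library. It does not describe any actual EBI service: the `SERVICE_NS` namespace, the `runSearch` operation and its parameter names are hypothetical placeholders, and no request is sent.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical service namespace, for illustration only.
SERVICE_NS = "urn:example:search"


def build_request(operation, params):
    """Return a SOAP 1.1 envelope (as a string) calling `operation` with `params`."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    # Each parameter becomes an unqualified child element of the call.
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and parameters.
request_xml = build_request("runSearch", {"program": "fasta", "database": "swall"})
print(request_xml)
```

A real client would normally generate such messages from the service's WSDL description rather than assembling them by hand.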

  30. The group • Group Leader: Rodrigo Lopez • 3 Web designers: Stephen Robinson, Asif Kibria, Gulam Patel • 4 Application developers: Ville Silventoinen, Sharmila Pillai, Emmanuel Quevillon, Adam Lowe • 3 Support (HelpDesk): Karen Duggan, Rob Harper, Tamara Kulikova • SRS Managers: Nicola Harte (SRS)
